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Abstract 

Fast  Fourier  Transforms  for 
Nonequispaced  Data 


Alok  Dutt 
Yale  University 
1993 


Two  groups  of  algorithms  are  presented  generalizing  the  fast  Fourier  transform  (FFT) 
to  the  case  of  non-integer  frequencies  and  nonequispaced  nodes  on  the  interval  [— tt,  tt]. 
These  schemes  are  based  on  combinations  of  certain  analytical  considerations  with 
the  classical  fast  Fourier  transform,  and  generalize  both  the  forward  and  backward 
FFTs.  The  number  of  arithmetic  operations  required  by  each  of  the  algorithms  is 
proportional  to  N  •  log  A’  -t-  N  •  log(l/£),  where  e  is  the  desired  precision  of  compu¬ 
tations  and  N  is  the  number  of  nodes.  Several  related  algorithms  are  also  presented, 
each  of  which  utilizes  a  similar  set  of  techniques  from  analysis  and  linear  algebra. 
These  include  an  efficient  version  of  the  Fast  Multipole  Method  in  one  dimension  and 
fast  cilgorithms  for  the  evaluation,  integration  and  differentiation  of  Lagrange  polyno¬ 
mial  interpolants.  Several  numerical  examples  axe  used  to  illustrate  the  efficiency  of 
the  approach,  and  to  compare  the  performances  of  the  two  sets  of  nonuniform  FFT 
algorithms. 
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Chapter  1 
Introduction 


Fourier  techniques  have  been  a  popular  analytical  tool  in  physics  and  engineering  for 
more  than  two  centuries.  A  reason  for  this  popularity  is  that  the  trigonometric  func¬ 
tions  e*"®  axe  eigenfunctions  of  the  differentiation  operator  and  thus  form  a  natural 
basis  for  representing  solutions  of  many  classes  of  differential  equations. 

More  recently,  the  arrival  of  digital  computers  and  the  development  of  the  feist 
Fourier  transform  (FFT)  algorithm  in  the  1960s  (see  [8])  have  established  Fourier 
analysis  as  a  powerful  and  practical  numerical  tool.  The  FFT,  which  computes  dis¬ 
crete  Fourier  transforms  (DFTs),  is  now  a  central  component  in  many  scientific  and 
engineering  applications,  most  notably  in  the  areas  of  spectral  analysis  and  signal 
processing.  Numerous  applications,  however,  involve  unevenly  spaced  data,  whereas 
the  FFT  requires  input  data  to  be  tabulated  on  a  uniform  grid.  In  this  dissertation, 
a  collection  of  algorithms  is  presented  which  overcome  this  limitation  of  the  FFT 
while  preserving  its  computational  efficiency.  These  algorithms  are  designed  for  the 
efficient  computation  of  certain  generalizations  of  the  DFT,  namely  the  forward  and 
inverse  transformations  described  by  the  equations 

=  (1.1) 

*=0 


for  j  =  0, N ,  where  €  C,  €  C,  €  [— A^/2,A^/2]  and  xj  €  [— 7r,x].  The 
number  of  arithmetic  operations  required  by  each  of  the  algorithms  is  proportional 
to 

iV-IogA'-HiV.log0  (1.2) 

where  e  is  the  desired  accuracy,  compared  with  0{N^)  operations  required  for  the 
direct  evaluation  of  (1.1)  and  0{N^)  for  the  direct  inversion. 
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Remark  1.1  The  DFT  can  be  described  by  either  of  the  two  closely  related  formulae 

/.  =  E  (1-3) 

fc=0 


for  j  =  0, . . . ,  jV  —  1,  and 

N/2-1 

/,=  ^  (1.4) 

k=-N/2 

for  j  =  —N/2, . . . ,  N/2  —  1.  While  the  form  (1-3)  is  normally  used  when  the  FFT  is 
discussed,  the  form  (1.4)  is  usually  preferred  in  applications  of  the  DFT  to  analysis 
(see,  for  example,  [3],  [12]). 


Remark  1.2  The  FFT  algorithm  reduces  the  number  of  operations  for  the  DFT 
from  0{N'^)  to  0{N  ■  logA^)  by  a  sequence  of  algebraic  manipulations  (for  more 
details  on  FFTs,  see  [5],  [8],  [9],  [23],  [24],  [25]).  In  the  more  general  case  of  (1.1), 
the  structure  of  the  linear  transformation  can  also  be  exploited  via  a  combination  of 
certain  analytical  results  with  the  FFT,  and  the  algorithms  of  this  thesis  are  based 
on  thi'  fact. 


Remark  1.3  All  algorithms  described  in  this  thesis  are  approximate  ones  in  the 
sense  that  they  compute  a  vector  /  such  that 

<£,  (1. 

where  the  vector  /  is  the  exact  result  of  the  calculation.  We  will  derive  error  bounds 
for  all  approximations  used  by  the  algorithms,  thereby  allowing  us  to  perform  numer¬ 
ical  computations  to  any  specified  accuracy  e. 


II/-. 

ll/ll 


A  somewhat  extensive  literature  is  devoted  to  the  development,  improvement  and 
implementation  of  classical  FFT  schemes,  whereas  the  numerical  analysis  of  general¬ 
izations  of  the  DFT  appears  to  be  relatively  incomplete.  The  problem  of  interpolating 
Fourier  series  from  irregular  to  regular  grids  has  arisen  in  numerous  areas,  computer 
assisted  tomography  to  mention  just  one.  Traditional  attempts  to  speed  up  com¬ 
putations  for  problems  of  this  type  combined  linear  or  other  low-order  polynomial 
interpolation  methods  with  filtering  techniques.  The  effectiveness  of  this  approach 
is  limited  in  part  by  the  low  accuracy  of  the  interpolation  scheme  used  (see,  for 
example,  [6],  [20],  [21]).  Another  generalization  of  the  DFT,  referred  to  as  the  Frac¬ 
tional  Fourier  Transform,  is  addressed  by  Bailey  and  Swarztrauber  [3]  who  describe 
an  algorithm  for  efficiently  computing  the  numbers 


N-l 

/j  =  ^  a*  •  e 
Jfe=0 


2TrikjX/N 


(1.6) 
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for  j  =  0, . . . ,  —  1,  where  A  is  any  complex  constant.  However,  this  algorithm  does 

not  permit  unevenly  spaced  input  data. 


1.1  Outline  of  the  Dissertation 

The  principal  results  of  this  thesis  are  two  groups  of  efficient  algorithms  for  computing 
FFTs  for  nonuniformly  spaced  data  to  any  specified  precision.  In  addition  the  thesis 
contains  a  number  of  secondary  results  upon  which  the  main  algorithms  are  based. 
Portions  of  this  work  have  been  published  previously  [10],  [11]. 

One  of  the  groups  of  nonuniform  FFT  algorithms  views  the  problem  as  a  form 
of  polynomial  interpolation  which  can  be  solved  efficiently  utilizing  a  fast  multipole- 
type  method  in  one  dimension.  These  algorithms  require  some  preliminary  results 
which  are  presented  in  Chapter  2.  In  this  chapter,  the  problem  of  evaluating  the 
Lagrange  interpolating  polynomial  on  the  real  line  is  discussed.  An  efficient  Fast 
Multipole  Method  (FMM)  for  one  dimensional  problems  is  first  developed  whose 
computational  cost  is  0{N)  arithmetic  operations.  Following  this  is  a  description  of 
an  adaptive  version  of  this  algorithm  which  requires  0{N)  operations  independently 
of  the  distribution  of  the  points  on  the  real  line.  The  chapter  ends  by  discussing 
how  these  FMM  algorithms  can  be  used  for  the  efficient  evaluation,  integration  and 
differentiation  of  Lagrange  polynomial  interpolants. 

In  Chapter  3  we  describe  a  set  of  four  algorithms  for  nonuniform  FFTs  which  are 
ba^ed  on  a  combination  of  the  classical  FFT  with  a  scheme  for  the  rapid  evaluation  of 
trigonometric  interpolants.  This  interpolation  scheme  is  based  on  the  one  dimensional 
FMM  of  Chapter  2,  and  is  used  to  interpolate  from  uniform  to  nonuniform  grids  and 
back. 

A  second  set  of  algorithms  for  nonuniform  FFTs  is  presented  in  Chapter  4.  This  set 
of  algorithms  also  uses  the  classical  FFT,  but  uses  a  different  interpolation  technique 
bcised  on  the  Fourier  analysis  of  the  Gaussian  bell. 

Each  of  the  algorithms  described  in  this  thesis  has  been  implemented  and  tested 
for  a  variety  of  problem  sizes  and  input  data.  Chapter  5  contains  details  of  these 
implementations  and  results  of  several  numerical  experiments  to  demonstrate  the 
performance  of  the  two  groups  of  nonuniform  FFT  algorithms. 

Finally,  Chapter  6  lists  several  generalizations  and  conclusions. 
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Chapter  2 

Polynomial  Interpolation  and  the 
FMM 


Polynomials  form  the  theoretical  basis  for  many  areas  of  applied  mathematics.  While 
the  mathematical  properties  of  polynomials  have  been  quite  well  understood  for  over 
a  century,  attempts  to  use  them  in  practical  calculations  have  met  with  difficulties. 
Typical  problems  accompanying  the  employment  of  classical  schemes  are  those  of 
prohibitive  computational  cost  and  numerical  instability.  However,  certain  classes 
of  orthogonal  polynomials  do  have  stable,  fast  transforms  associated  with  ther.  i,  the 
most  popular  being  Chebyshev  polynomials  which  can  be  manipulated  in  a  stable 
and  efficient  manner  via  the  fast  cosine  transform  or  FFT. 

In  this  chapter  we  present  a  group  of  three  algorithms  for  the  interpolation,  in¬ 
tegration  and  differentiation  of  functions  tabulated  at  nodes  other  than  Chebyshev. 
The  first  of  these  algorithms  takes  as  input  a  set  of  points,  {xi, . . . ,  lAf},  and  a  set  of 
function  values,  {/i, •  •  • , /v}!  and  evaluates  the  unique  interpolating  polynomial  of 
degree  TV  —  1  at  the  points  {j/i, . . .  ,yiv}  for  a  computational  cost  proportional  to  N. 
The  other  two  algorithms  perform  spectral  integration  and  differentiation  of  this  in- 
terpolant.  We  will  also  describe  three  efficient  versions  of  the  Fast  Multipole  Method 
(FMM)  for  one  dimensional  problems;  these  algorithms  are  used  by  the  polynomial 
interpolation  algorithm  of  this  chapter.  Throughout  the  chapter  we  will  be  using  the 
well  known  Lagrange  representation  of  interpolating  polynomials  which  is  defined  by 
the  formula 


N  N 

= E  A  ■  n 

J=1  k=\ 


(2.1) 
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A  simple  algebraic  manipulation  converts  (2.1)  to  the  form 

p«(x)= n(x 

fc=l  i=l  -^3  ;t=l  J 

I^¥^3 

which  can  be  evaluated  at  N  points  in 

0(iV.log(i))  (2.3) 

arithmetic  operations  using  the  Fast  Multipole  Method  (FMM)  of  [16]  (e  here  is 
the  desired  accuracy).  In  comparison,  a  direct  evaluation  of  (2.2)  requires  O(N^) 
operations. 

Remark  2.1  A  somewhat  different  classical  polynomial  interpolation  problem  con¬ 
sists  of  determining  a  set  of  parameters  {oi, . . . ,  a;v}  such  that,  for  j  =  1, . . . ,  A^, 

H  Ufc  •  =  fj  (2.4) 

*=j 

where  {xi, . . . ,  is  a  given  a  set  of  points  and  {/i, . . . ,  /n}  is  a  given  set  of  function 
values.  This  problem  is  highly  ill-posed  for  anything  other  than  very  small  values  of 
N  and  this  formulation  is  seldom  used  when  aetual  calculations  are  being  performed. 

Remark  2.2  The  Lagrange  interpolation  formula  has  traditionally  been  less  favored 
for  practical  calculations  than  other  classical  methods  (see,  for  example,  [23]).  How¬ 
ever,  the  algorithms  of  this  chapter  are  numerically  stable  and  very  efficient,  as 
demonstrated  by  our  numerical  experiments,  thus  affording  the  Lagrange  formula 
a  substantial  advantage  over  other  techniques  for  the  manipulation  of  polynomials. 

Following  is  a  plan  of  the  chapter.  The  first  three  sections  are  devoted  to  efficient 
versions  of  the  FMM  which  can  be  used  to  evaluate  expressions  of  the  form  (2.2); 
Section  2.1  contains  a  number  of  results  from  analysis  which  are  used  in  the  design 
of  these  algorithms,  in  Section  2.2  we  present  the  FMM  algorithm  itself,  and  in 
Section  2.3  we  present  an  adaptive  version  of  this  algorithm.  A  brief  discussion  of 
versions  of  the  FMM  for  other  kernels  is  contained  in  Section  2.4.  A  Fast  Polynomial 
Interpolation  algorithm  utilizing  the  results  of  the  previous  sections  is  described  in 
Section  2.5,  and  finally,  in  Section  2.6  we  describe  how  this  algorithm  can  be  used  to 
construct  fast  algorithms  for  high  order  numerical  integration  and  differentiation. 
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2.1  Mathematical  and  Numerical  Preliminaries 

The  algorithms  of  this  chapter  are  based  on  several  results  from  the  Chebyshev  ap¬ 
proximation  theory  of  the  function  1/x.  This  analysis  is  presented  in  the  Lemmas 
and  Theorems  of  this  section,  numbered  2.1-2.10.  The  main  results  of  this  section 
fall  into  two  categories.  Theorems  2.5  and  2.7  describe  how  the  function  l/x  can 
be  approximated  on  different  regions  of  the  real  line  using  Chebyshev  expansions, 
and  Theorems  2.8,  2.9  and  2.10  provide  three  ways  of  manipulating  these  expansions 
which  are  needed  by  the  fast  tdgorithms  of  this  chapter. 

We  begin  with  three  classical  definitions  which  can  be  found,  for  example,  in  [14], 

[231. 

Definition  2.1  The  n-th  degree  Chebyshev  polynomial  T„{x)  is  defined  by  the  follow¬ 
ing  equivalent  formulae: 

Tn{x)  =  cos(n  arccos  x),  (2-5) 

r„(x)  =  (2.6) 

Definition  2.2  The  roots  {ti, . . .  ,t„}  of  the  n-th  degree  Chebyshev  polynomial  Tn  lie 
in  the  interval  [—1,1]  and  are  defined  by  the  formulae 


for  k  =  I, . . .  ,n.  They  are  referred  to  as  Chebyshev  nodes  of  order  n. 

Definition  2.3  We  will  define  the  polynomials  uj, . . .  ,u„  of  order  n  —  \  by  the  for¬ 
mulae 

=  n  7 — T 

fc=i 

for  j  =  1, . . .  ,n,  where  tk  are  defined  by  (2.7). 

The  order  n  —  \  Chebyshev  approximation  for  a  function  /  :  [— 1, 1]  — >  C  is  defined 
as  the  unique  polynomial  of  order  n  —  1  which  agrees  with  /  at  the  nodes  tj, . . . , 
There  exist  several  standard  representations  for  this  polynomial,  and  the  one  we  will 
use  in  this  chapter  is  given  by  the  expression 

7=1 


(2.9) 
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For  the  purposes  of  this  chapter,  Chebyshev  expansions  for  any  function  will  be 
characterized  by  values  of  this  function  tabulated  at  Chebyshev  nodes. 

Lemmas  2. 1-2.3  provide  estimates  involving  Chebyshev  expansions  which  are  used 
in  the  remaunder  of  this  section.  The  proof  of  Lemma  2.1  is  obvious  from  (2.5). 

Lemma  2.1  Let  T„(x)  be  the  Chebyshev  polynomial  of  degree  n.  Then, 

|r.(i)i  <  1  (2.10) 


for  any  x  €  [—1,1]. 

Lemma  2.2  Let  T„(i)  be  the  Chebyshev  polynomial  of  degree  n.  Then, 

bx 


ir.(x)|  >  i 

for  any  x  such  that  )x|  >  3. 

Proof.  From  Definition  2.1,  we  have 

IT’nCx)!  =  ^  •  |(x  +  +  (a:  - 


1 

>  2 


5x 


for  any  x  such  that  lx|  >3. 


(2.11) 


i  •  lx  +  -  (x/3)=r  =  5  ■  |x  ■  (1  +  >79)1“  (2.12) 


□ 


Lemma  2.3  Let  Uj{x)  be  defined  by  (2.8)  for  j  =  1, . . .  ,n.  Then,  for  any  x  €  [—1, 1], 

|wj(a:)|  <  1.  (2.13) 

Proo  It  is  obvious  from  (2.8)  that  Uj{tj)  =  1,  and  that  Uj{tk)  =  0  when  k  ^  j.  In 
addition,  some  algebraic  manipulation  reveals  that  the  expression 

-t,n(ti)Ux)  (2.14) 

”  k=i 

is  also  equal  to  1  when  x  =  tj  and  equal  to  0  when  x  =  tk  for  all  other  k.  Since  uj 
and  (2.14)  are  both  polynomials  of  order  n  —  1,  we  have 

”  1=1 


(2.15) 
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for  j  =  Furthermore,  due  to  the  combination  of  (2.15)  and  the  triangle 

inequality,  we  have 


\Uj{x)\  = 

for  any  x  €  [—1, 1]. 


”  k=\ 


i-E  •  in(x)i  <  1 

^  k=i 


(2.16) 

□ 


The  following  lemma  provides  an  identity  which  is  used  in  the  proof  of  Theo¬ 
rem  2.5. 

Lemma  2.4  Suppose  that  n  >  2,  and  that  b  >  0  and  xq  are  real  numbers  with 
liol  >  36.  Then,  for  all  x, 


x\  Tn{x/b) 


Tnixolb)- 

Proof.  Let  by  tbe  polynomial  of  degree  n  defined  by  the  formula 


(2,17) 


(2.18) 


Using  (2.8)  we  obtain 


Q{btk)  =  1  -  (btk  -  xo)  •  ^ 


j—l  btj  Xq 


Uj(tk) 


=  1  -  (btk  -  Xo) 


1 


=  0, 


btk  —  Xo 

for  k  =  1, . . . ,  n.  Clearly,  then,  Q{x)  satisfies  the  conditions 

Q(xo)  =  1 
Q(bh)  =  0 

Qibtn)  =  0. 


(2.19) 


(2.20) 


It  is  clear  that  the  function  Tn{xlb)/T„{  rQ/k)  is  also  a  polynomial  of  degree  n  which 
satisfies  the  n  -f  1  conditions  (2.20).  Therefore, 


and  (2.17)  follows  as  an  immediate  consequence  of  (2.18)  and  (2.21). 


(2.21) 

□ 
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Theorem  2.5  Suppose  that  n  >  2,  and  that  b  >  0  and  xq  are  real  numbers  with 
|a:o|  >  36.  Then, 


_1 _ ^  1  ^  /x\  ^  1 

X  —  xo  btj  —  Xo  V  6  /  *''6-5" 


(2.22) 


for  any  x  6  [—6,6]. 


Proof.  Due  to  Lemma  2.4,  we  have 


1 

\x-xo\  \T,^{xo/b)\' 


(2.23) 


Also,  due  to  Lemmas  2.1  and  2.2,  we  have 


for  any  x  €  [— 6, 6],  and 


|r„(x/6)l  <  1 


I'T  /  ILW  ^  ”  5” 

|7^a(xo/6)|  >  -  •  >  2 


(2.24) 


(2.25) 


for  any  jxol  >  36.  It  follows  from  the  combination  of  (2.23),  (2.24)  and  (2.25)  that 


-J _ <i.l<  J_ 

X  —  Xo  6tj  —  Xo  ^\bJ  26  5"  6-5" 


(2.26) 


for  any  x  €  [—6, 6]. 


The  following  lemma  provides  an  identity  which  is  used  in  the  proof  of  Theo¬ 
rem  2.7. 

Lemma  2.6  Suppose  that  n  >  2,  and  that  6  >  0  and  xq  are  real  numbers  with 
|xo|  <  6.  Then,  for  all 


c  /Ok  e-  \  ..  /  c\  —  ^n(0 

^  (36  ^Xo)  •  2^  OL  4  ~  ’  T  (Oh/  \  ■ 

j— I  36  tjXQ  Xq  Tn\ob/  XqJ 


(2.27) 


Proof.  Let  Q(0  fhe  polynomial  of  degree  n  defined  by  the  formula 


Q(0  =  f  -  (3t  -  «X„) .  •£  •  u,(0. 


(2.28) 
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Using  (2.8)  we  obtain 

Q{tk)  =  tk-  (36  -  tkXo)  ■  ^  - Uj{tk) 

=  tk-{3b-tkXo)-:r^^ - =0, 

•Su 

for  k  =  1, . . .  ,n.  Cleajly,  then,  Q{0  satisfies  the  conditions 

Q{3b/xo)  =  36/io 
Qih)  =  0 


Q{tn)  =  0. 


It  is  clear  that  the  function 


(2.29) 


(2.30) 


—  ^”(0  /'2  31) 

xq  TniZb/xo) 

is  also  a  polynomial  of  degree  n  which  satisfies  the  n  -f  1  conditions  (2.30).  Therefore, 


n(£)  =  — _ hfiQ. _ 

TniZb/xoV 

and  (2,27)  follows  as  an  immediate  consequence  of  (2.28)  and  (2.32). 


(2.32) 


Theorem  2.7  Suppose  that  n  >  2,  and  that  b  >  0  and  xq  are  real  numbers  with 
lxo|  <  b.  Then, 

I  1  ^  t,  /36\|  3 


1 _ ^  tj 

—  Xq  ^  36  —  tjXo  / 


for  any  x  such  that  |x|  >  36. 

Proof.  Writing  ^  =  36/i,  we  have  |4|  <  1  whenever  |x|  >  36,  and 


(2.33) 


X  —  xo  36/^  —  xo  36  —  ^xo 


Due  to  Lemma  2.6,  we  have 


^ _ ^  U  /f\  ^  ^  l^n(OI 

36 -^xo  ^36-tjXo  ■'  |36-^xo|  |xoI  |r„(36/xo)r 


(2.34) 


(2.35) 


12 


CHAPTER  2.  POLYNOMIAL  INTERPOLATION  AND  THE  EMM 


In  addition,  due  to  Lemmas  2.1  ajid  2.2  we  have 


36|r„(0|<36 


(2.36) 


for  (  €  [—1, 1],  and 


|io  ■  r„(36/io)l  >  ^ 


5-36 


3xq 


> 


5" -6 


(2.37) 


for  |xol  <  b.  Substituting  (2.36)  and  (2.37)  into  (2.35),  we  obtain 


3b  — ^xq  j^i3b  —  tjXo 


■ 


1  36-2 

<  XT 


26  5"  •  6  6-5" 


(2.38) 


for  ^  €  [—1, 1].  Now  it  follows  from  the  combination  of  (2.34)  and  (2.38)  that 


36' 
u,  I  — 


—  xq  ^  36  —  tjXo  \x 


< 


6-5” 


(2.39) 

□ 


The  following  three  theorems  provide  formulae  for  translating  along  the  real  axis 
Chebyshev  expansions  of  the  type  described  in  the  previous  two  theorems.  Theo¬ 
rem  2.8  provides  a  formula  for  translating  expeinsions  described  in  Theorems  2.5, 
Theorem  2.9  describes  a  mechcinism  of  converting  the  expansion  of  Theorem  2.5  to 
the  expansion  of  Theorem  2.7,  and  Theorem  2.10  provides  a  way  of  translating  the 
expansion  of  Theorem  2.7.  These  theorems  are  one-dimensional  counterparts  of  the 
three  theorems  in  [16]  which  provide  translation  operators  for  multipole  expansions 
in  the  complex  plane. 


Theorem  2.8  Suppose  that  n  >  2  and  let  c,d  be  a  pair  of  real  numbers  such  that 
[c  —  d,c  +  d\  C  [—1,  !]•  Then,  for  any  set  of  complex  numbers  ^i, . . . ,  o-nd  any 
X  €  [c-d,c-\-d\, 

■  Uj{x)  =  ^  •  uj{c  -t-  dtk))  ■  Uj  •  (2.40) 

i=i  k=i  \  a  / 


Proof.  To  prove  this  theorem  we  first  show  that 


n 


“j(2:)  =  H  +  dtk)  •  Wfc 

fe=i 


(2.41) 
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for  X  €  [c  —  c  -f  d\.  Indeed,  the  right  hand  side  of  (2.41)  is  simply  the  (n  —  l)-th 
degree  Lagrange  interpolating  polynomial  for  the  function  Uj(i)  at  the  nodes  c  + 
dti, . . .  ,c  +  din-  However,  Uj(x)  itself  is  a  polynomiad  of  degree  n  —  1,  and  is  therefore 
equal  to  its  Lagrange  interpolant  of  order  n  —  1. 

The  formula  (2.40)  then  follows  as  an  immediate  consequence  of  (2.41).  □ 


Theorem  2.9  Suppose  that  n,N  >  2  and  let  c,d  be  a  pair  of  real  numbers  such  that 
|c|  —  d  >  3.  Let  the  function  /  ;  R  ^  C  be  defined  by  the  formula 


N 

/(^) = iz 


k=l 


Ctk 

X  Xjf 


(2.42) 


where  Xk  €  [—1,1]  for  k  =  l,...,iV,  and  oi,...,a;v  is  a  set  of  complex  numbers. 
Further,  let  ti, . . . ,  be  Chebyshev  nodes  defined  by  (2.7),  /ei  . . . ,  be  a  set  of 
complex  numbers  defined  by  the  formula 

f  (I) 

for  k  =  1, . . . ,  n,  and  /e<  . . . ,  be  a  set  of  complex  numbers  defined  by  the  formula 


j=l  ^ 

Then,  for  any  x  ^[c  —  d,cA  d\, 


cAdtk^ 


where  A  =  Y,k=i  l«fcl- 

Proof.  Due  to  the  triangle  inequality. 


<  A 


3n  +  1 


<  Si  +  S2, 


where 


and 


Si  = 


5,= 


f{^)  -  ^  /(c  +  dtk)  ■  U, 
^(/(c  +  dtk)  -  fc)  •  Uj 


(2.44) 


(2.45) 


(2.46) 


(2.47) 

(2.48) 
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Combining  Theorem  2.5  with  the  triangle  inequcility  we  obtain 


5-5"’ 


(2.49) 


and  from  the  combination  of  Theorem  2.7,  Lemma  2.3  and  the  triangle  inequality,  we 
obtain 


^2  <  ^  /(c  +  dtk)  -  ^  •  Uj  ( 

k=l  j=l 


c-\-dtkJ\  6  •  5"  ’ 


(2.50) 


where  A  =  Finally,  substituting  (2.49)  and  (2.50)  into  (2.46)  we  have 


^  /X  —  c\  3n  +  1 


(2.51) 


for  any  x  €  [c  —  d,  c  +  d]. 


Theorem  2.10  Suppose  that  n,N  >  2  and  let  c,d  be  a  pair  of  real  numbers  such 
that  [c  —  d,  c  +  d]  D  [—1, 1].  Let  the  function  /  :  R  — »  C  be  defined  by  the  formula 


f(-)  =  E  7^ 

k-i  X  ^k 


(2.52) 


where  Xk  €  [—1,1]  for  k  =  1,...,A^,  end  ai,...,ajv  is  a  set  of  complex  numbers. 
Further,  let  ti, . . . ,  be  Chebyshev  nodes  defined  by  (2.7),  /ef  $i, . . . ,  be  a  set  of 
complex  numbers  defined  by  the  formula 


(2.53) 


for  k  —  1, . . .  ,n,  and  /et  $i, . . . ,  $„  be  a  set  of  complex  numbers  defined  by  the  formula 


=  E  •  “i  ( 

j=i  ^ 


tk  ^ 
Zd  +  ctk^ 


(2.54) 


Then,  for  any  x  such  that  |x  —  c|  >  3d, 


3(ra  +  1) 
6-5”  ’ 


(2.55) 


where  A  =  Ylk=i 
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Proof.  It  follows  from  the  triangle  inequality  that 

*=1 

where 

Combining  Theorem  2.7  with  the  triangle  inequality,  we  obt2hn 


6-5'*’ 


(2.56) 


(2.57) 


(2.58) 


(2.59) 


and  from  the  combination  of  Theorem  2.7,  Lemma  2.3  and  the  triangle  inequality,  we 
obtain 


52  <  ^  /  I  c  + 


3d  +  c<fc/  6-5”’ 


(2.60) 


where  A  =  Finally,  substituting  (2.59)  and  (2.60)  into  (2.56)  we  have 


r.  ^  /  3d  \  ,  3(n  +  l 

j  <A--^ 


(2.61) 


for  any  x  such  that  |x  —  c|  >  3d. 


2.2  The  Fast  Multipole  Method  in  One  Dimen¬ 
sion 


In  this  section  we  consider  the  problem  of  computing  the  sums 


f=y^ 

'  yj  - 


(2.62) 


for  y  =  1, . . . ,  A^,  where  {xj, . . .  ,x/v}  and  {j/i, . . . ,  j/at}  are  sets  of  real  numbers,  and 
{aj,...,aAr}  and  {fi,  ■ . . ,  fjv}  are  sets  of  complex  numbers.  This  problem  can  be 
viewed  as  the  special  case  of  the  A^-body  problem  in  physics  where  we  wish  to  evaluate 
the  electrostatic  field  due  to  N  charges  which  lie  on  a  straight  line  at  a  set  of  points 
on  this  line. 
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Remark  2.3  For  the  remainder  of  this  chapter,  we  shall  assume  without  loss  of 
generality  that  li,  j/i  €  [—1,1]  for  i  =  1, . . . , 


Remark  2.4  The  fast  multipole  algorithm  of  Greengard  and  Rokhlin  [16]  computes 
sums  of  a  more  general  form  than  (2.62)  in  0{N)  arithmetic  operations.  This  more 
general  form  is  described  by  the  formulae 


N 


f,  =  T. 

k=l 


Wj  -  Zk 


(2.63) 


for  j  =  1, . . . ,  A^,  where  {^i, . . . ,  zjv}  and  {rui, . . . ,  iuat)  are  sets  of  complex  numbers. 
From  a  physical  viewpoint,  this  corresponds  to  the  evaluation  of  the  electrostatic 
field  due  to  N  charges  which  lie  in  the  plane.  While  the  two  and  three  dimensional 
scenarios  for  the  A/^-body  problem  have  been  discussed  in  some  detail  (see,  for  example, 
[7],  [16]),  the  analysis  and  applications  of  one  dimensional  problems  appear  to  have 
been  largely  overlooked,  with  one  exception  of  which  we  are  aware:  the  application  of 
FMM  techniques  to  various  problems  in  numerical  linear  algebra,  by  Gu  and  Eisenstat 
(see  [17],  [18]). 

In  this  section  we  present  an  0{N)  algorithm  for  the  one-dimensional  problem 
which  is  based  on  the  two-dimensional  FMM,  but  incorporates  a  number  of  modifi¬ 
cations  which  accelerate  the  scheme  significantly.  We  summarize  these  modifications 
below,  with  the  assumption  that  the  reader  is  familiax  with  [16]. 

1.  Replacement  of  all  complex  by  complex  multiplications  with  real  by  complex 
multiplications. 

2.  Replacement  of  multipole  expansions  with  Chebyshev  expansions.  Chebyshev 
series  are  known  to  converge  more  rapidly  than  multipole  expansions  (which 
are  actually  Taylor  series). 

3.  Further  compression  of  the  Chebyshev  expansions  by  a  suitable  change  of  basis 
(see  Section  2.2.4).  Using  this  technique,  the  function  1/x  can  be  accurately 
represented  by  about  a  quarter  of  the  number  of  coefficients  which  were  required 
by  the  original  two-dimensional  FMM. 

4.  In  one  dimension,  each  subinterval  hais  3  other  subintervals  in  its  interaction 
list,  whereas  in  two  dimensions  each  box  has  27  other  boxes  in  its  interaction 
list. 

5.  In  one  dimension,  each  subinterval  heis  2  nearest  neighbor  subintervals,  whereais 
in  two  dimensions  each  box  has  8  nearest  neighbor  boxes. 
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>  2r 


11X2  •  •  •  Ijv 


xq  -  r  Xq  xo  +  X 


•  •  •  Vm 

— K- — - -h- 

Vo-r  yo  yo  +  r 


Figure  2.1:  Well- separated  intervals  on  the  line. 

This  section  is  divided  into  four  parts.  Sections  2.2. 1-2. 2. 3  are  devoted  to  an 
algorithm  for  the  FMM  in  one  dimension  which  uses  Chebyshev  expansions  in  place 
of  multipole  expansions.  In  Section  2.2.4,  we  describe  a  more  efficient  algorithm  which 
is  based  on  the  algorithm  of  Section  2.2.3  but  uses  a  Singular  Value  Decomposition 
to  further  compress  the  Chebyshev  expansions. 

2.2.1  General  Strategy 

We  will  illustrate  by  means  of  a  simple  example  how  Chebyshev  expansions  can  be 
used  to  evaluate  expressions  of  the  form  (2.62)  more  efficiently.  We  will  also  give 
an  informal  description  of  how  the  method  of  this  simple  example  is  used  in  the 
construction  of  a  fast  algorithm  for  the  general  case. 

First  we  introduce  a  definition  which  formalizes  the  notion  of  well-separated  in¬ 
tervals  on  the  real  line.  This  is  simply  the  one-dimensional  analog  of  the  definition 
of  well-separatedness  in  [16]. 

Definition  2,4  Let  {ii,...,xyv}  and  {yi,. .  ■  ,yM^  two  sets  of  points  in  R.  We 
say  that  the  sets  {i,}  and  {j/,}  are  well-separated  if  there  exist  points  io,J/o  €  R  and 
a  real  r  >  0  such  that 


|ii  —  Xo|  <  r  V  i  =  1,. . .  ,N 

ly.  -  yol  <  T"  V  i  =  1,. . .  ,M  and  (2.64) 

|xo  -  yo|  >  4r. 


Suppose  now  that  {xi, ....  x^v}  and  {yi, .  -  - ,  yxr}  are  well-separated  sets  of  points 
in  R  (see  Figure  2.1),  that  {Qi,...,aAr}  is  a  set  of  complex  numbers,  and  that  we 
wish  to  compute  the  numbers  /(j/i),  •  •  • , /(j/m)  where  the  function  /  :  R  — >  C  is 
defined  by  the  formula 


N 

=  L 


jt=i 


(2.65) 


A  direct  evaluation  of  (2.65)  at  the  points  {t/i, . . . ,  yw}  requires  0{NM)  arithmetic 
operations.  We  will  describe  two  different  ways  of  speeding  up  this  computation. 
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Following  is  the  first  of  these  approaches. 

Let  the  function  /]  :  R  — »  C  be  defined  by  the  formula 


N 


/l(^)  =  W  _  >1 

kzz\  j=l  iji^k  ^o) 


(2.66) 


where  p  is  an  integer  and  tj, . . .  ,tp  and  ui, . . .  ,Up  are  given  by  Definitions  2.2  and 
2.3. 


Observation  2.5  From  the  combination  of  (2.65),  (2.66),  Theorem  2.1  and  the  tri¬ 
angle  inequality,  we  see  that 

|/(I)  -  /.(x)|  <  (2.67) 

for  any  x  such  that  |x  —  io|  >  3r. 


Now  let  {$1, . . . ,  $p}  be  a  set  of  complex  numbers  defined  by  the  formulae 


for  j  =  1, . . .  ,p.  Then, 


$ .  =  V  - - 

Zr-tj{xk~xo) 


(2.68) 


(2.69) 


Remark  2.6  The  vector  $  will  be  referred  to  as  the  far-field  expansion  for  the  in¬ 
terval  [lo  —  r,  lo  +  r]. 


Computation  of  the  coefficients  requires  0{Np)  operations,  and  a  subsequent 
evaluation  of  fi{yi), . . . ,  /i(yAf)  is  an  0{Mp)  procedure.  The  total  computational  cost 
of  approximating  (2.65)  to  a  relative  precision  1/5’’  is  then  0(7Vp  -f-  Mp)  operations. 
An  alternative  way  of  speeding  up  this  computation  is  described  below. 

Let  the  function  /2  :  R  — >  C  be  defined  by  the  formula 


N 


1 


/2(a:)  =  — 

-  J/o) 


(2.70) 


where  p  is  an  integer  and  ti,...,tp  and  ui,.  ..,Up  are  given  by  Definitions  2.2  and 
2.3. 
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Observation  2.7  From  the  combination  of  (2.65),  (2.W),  Theorem  2.5  and  the  tri¬ 
angle  inequality,  we  see  that 


!/(*)  -  < 


i:Li  loti 

r  ■  5^ 


for  any  x  such  that  |a;  —  yol  < 


(2.71) 


Now  let  {$1, . . . ,  be  a  set  of  complex  numbers  defined  by  the  formulae 

1 


N 


for  j  =  1, . . . ,  p.  Then, 


^  =  V  ajt  •  —  , 

rtj  -{xk-  xo) 


A(x)  =  t  .  u,  (^) 


(2.72) 


(2.73) 


Remark  2.8  The  vector  ^  will  be  referred  to  as  the  local  expansion  for  the  interval 
[l/o  -r,yo  +  r]. 


Computation  of  the  coefficients  requires  0{Np)  operations,  and  a  subsequent 
evaluation  of  /2(yi ),.-., /2(yii/)  is  an  0{Mp)  procedure.  Ageiin  the  toted  compu¬ 
tational  cost  of  approximating  (2.65)  to  a  relative  precision  1/5’’  is  0{Np  -f  Mp) 
operations. 

Consider  now  the  general  case,  where  the  points  {xi,...,x;v}  and  {yi,  ■  ■ .  ,yM} 
are  arbitrarily  distributed  on  the  interval  (—1, 1]  (see  Remark  2.3).  We  use  a  hierar¬ 
chy  of  grids  to  subdivide  the  computational  domain  [—1,1]  into  progressively  smaller 
subintervals,  and  to  subdivide  the  sets  {x,}  and  {y,}  according  to  subinterval  (see 
Figure  2.2).  A  tree  structure  is  imposed  on  this  hierarchy,  so  that  the  two  subin¬ 
tervals  resulting  from  the  bisection  of  a  larger  (parent)  interval  are  referred  to  as 
its  children.  Two  Chebyshev  expansions  are  associated  with  each  subinterval:  a  far- 
field  expansion  for  the  points  within  the  subinterval,  and  a  local  expansion  for  the 
points  which  are  well- separated  from  the  subinterval.  Interactions  between  pairs  of 
well-separated  subintervals  can  be  computed  via  these  Chebyshev  expansions  in  the 
manner  described  above,  and  all  other  interactions  at  the  finest  level  can  be  computed 
directly.  Once  the  precision  has  been  fixed,  the  computational  cost  of  the  entire  pro¬ 
cedure  is  0{N)  operations  (for  a  detailed  description  and  complexity  analysis,  see 
Section  2.2.3). 
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level 

0 

1 

2 

3 


2.2.2  Notation 

In  this  section  we  introduce  the  notation  to  be  used  in  the  description  of  the  algorithm 
below. 

•  p  will  be  an  integer  denoting  the  size  of  Chebyshev  expansions  used  by  the 
algorithm.  Normally,  p  —  f— log5(£)],  where  e  is  the  desired  precision  of  com¬ 
putations. 

•  ti,..  .,tp  will  denote  Chebyshev  nodes  of  order  p  on  the  interval  [—1, 1],  defined 
by  the  formulae 

f2i  —  l  t\ 

u  =  cos  .  -  j  (2.74) 

for  t  =  1 , . . . ,  p. 

•  Ui(f), . . . ,  itp(t)  will  denote  the  set  of  polynomials  defined  by  the  formulae 

“.w=n^.  (2-75) 

k=i 

for;  =  l,...,p. 

•  s  will  be  an  integer  denoting  the  number  of  points  in  each  subinterval  at  the 
finest  level  of  subdivision.  Normally,  s  ss  2p  (see  Remark  2.9). 

•  nlevs  =  [log2(iV/5)]  will  denote  the  level  of  finest  subdivision  of  the  interval 
[-1,1]- 

•  will  be  a  p-vector  denoting  the  far-field  expansion  for  the  subinterval  i  at 
level  1. 
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•  will  be  a  p-vector  denoting  the  local  expansion  for  the  subinterval  i  at  level 

1. 

•  Ml  and  Mr  will  be  p  x  p  matrices  for  obtaining  fax-field  expansions  for  subinter¬ 
vals  in  terms  of  the  far-field  expansions  of  their  children.  These  will  be  defined 
by  the  formulae 


(2.76) 

which  are  obtained  from  the  formula  (2.54)  of  Theorem  2.10  applied  to  the  cases 
c  =  —l,d  =  2  and  c  =  l,d  =  2. 

•  Sl  and  Sr  will  be  p  x  p  matrices  for  obtaining  local  expansions  for  subintervals 
in  terms  of  the  local  expansions  of  their  parent.  These  will  be  defined  by  the 
formulae 


Sdhj)  =  “)  ’ 

which  are  obtained  from  the  formula  (2.40)  of  Theorem  2.8  applied  to  the  cases 
c  =  —  1  and  c  —  \,d= 

•  Ti,  72,  Tz  and  74  will  be  p  x  p  matrices  for  obtaining  local  expansions  from 
far-held  expansions.  These  will  be  dehned  by  the  formulae 


Ti{i,j)  =  Uj  1 

r  3  \ 

U,  -6/ 

^2(t,j)  =  Uj  1 

7  3  ^ 
Ui-4/ 

Tz{i,3)  =  y-j  ( 

r  3  N 
Kti  +  4/ 

T4{i,3)  =  Uj  1 

f  3  \ 
Kti  +  6/ 

(2.78) 


which  axe  obtained  from  the  formula  (2.44)  of  Theorem  2.9  applied  to  the  cases 
c  =  — 6,d  =  1,  c  =  — 4,d  =  1,  c  =  4,d  =  1  and  c  =  6,d  =  1. 
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2.2,3  Description  of  the  Algorithm 

Following  is  a  formal  description  of  an  algorithm  for  the  efficient  evaluation  of  ex¬ 
pressions  of  the  form  (2.62). 


Algorithm  2.1 

Step  Complexity  Description 

1  0(1)  Comment  [Input  problem  size,  N,  and  a  real  number  e  >  0.] 

Set  size  of  Chebyshev  expansions  p  =  [-log5(£)],  choose  s  and  set 
level  of  refinement  nlevs  =  (log2(A7*)l. 

2  0{Np)  Comment  [Determine  far-field  expansions  for  subintervals  at  finest 

level.] 

do  i  = 

Compute  p-term  far-field  expansion  $n/eus,t  due  to  the  subset  of  {xj} 
which  lie  in  subinterval  i  at  level  nlevs. 
enddo 


3  0{2Np‘^/s)  Comment  [Determine  p-term  far-field  expansions  for  each  subinter¬ 

val  at  every  level  by  shifting  and  adding  far-field  expansions  of  the 
subinterval’s  children.] 

do  I  =  nlevs  - 
do  t  =  1, . . .,2^ 

=  Ml  •  +  Mfi  ■ 

enddo 

enddo 


4  0{SNp^ /s)  Comment  [Determine  p-term  local  expansions  for  each  subinterval 

at  every  level  by  first  shifting  local  expansion  of  the  subinterval’s 
parent,  and  then  adding  the  interactions  with  the  three  subintervals 
which  are  well- separated  from  the  subinterval,  but  which  have  not 
been  accounted  for  at  the  parent’s  level.] 

do  /  =  1. . . nlevs  —  1 
do  i  -  1,  ..,2‘ 

^l+x,2i-l  =  Sl  •  +  Ti  •  $J+i,2t-3  +  Ts  •  $;+i,2i+l  +  T4  •  $/+i,2.+2 

^/-(-1,2»  =  Sr  •  -I-  Ti  •  $/+i,2i-3  +  T2  •  $/+l,2t-2  +  T4  •  $/+i,2t-(-2 

enddo 

enddo 


2.2.  THE  FAST  MULTIPOLE  METHOD  IN  ONE  DIMENSION 


23 


5  0{Np)  Comment  [Evaluate  local  expansions  for  subintervals  at  finest  level.] 

do  i  = 

Evaluate  p-term  local  expansions  9niev3,i  at  the  subset  of  {j/j}  which  lie 
in  subinterval  i  at  level  nlevs. 
enddo 

6  0(3iVs)  Comment  [Add  nearest  neighbour  interactions  which  have  not  yet 

been  accounted  for  via  expansions.] 

do  t= 

For  each  yt  in  subinterval  t  at  level  nlevs,  compute  interactions  with  all 
Xj  in  subintervals  i  —  1,  i,  i  +  1,  and  add  to  well-separated  values, 
enddo 

Total  0(N  •  (2p  -f  lOp^/s  -)-  3s)) 


Remark  2.9  The  number  s  can  be  chosen  to  minimize  the  total  operation  count  of 
the  algorithm,  which  yields  s  %  2p.  The  above  algorithm  then  requires  order 


arithmetic  operations. 


13-7V-P 


(2.79) 


Remark  2.10  The  operation  count  for  Step  6  assumes  that  the  points  {xi, . . . ,  xn} 
are  reasonably  uniformly  distributed.  Highly  nonuniform  distributions  are  discussed 
in  Section  2.3. 


2.2.4  A  More  Efficient  Algorithm 

Chebyshev  expansions  are  not  the  most  efficient  means  of  representing  interactions 
between  well-separated  intervals.  All  the  matrix  operators  of  the  algorithm  are  nu¬ 
merically  rank-deficient,  and  can  be  further  compressed  by  a  suitable  change  of  basis. 
The  orthogonal  matrices  required  for  this  basis  change  are  obtained  via  singuleo'  value 
decompositions  (SVDs)  of  appropriate  matrices. 

In  this  subsection  we  describe  a  more  efficient  version  of  the  one  dimensional  FMM 
which  is  based  upon  this  observation.  The  algorithm  is  very  similar  to  Algorithm  2.1 
with  one  important  modification:  separate  changes  of  basis  are  needed  for  interactions 
from  the  left  and  interactions  from  the  right,  so  for  every  subinterval  we  maintain 
two  sets  of  expansions,  leftward  expansions  and  rightward  expansions. 

We  will  require  some  additional  notation  for  the  algorithm  description  of  this 
section.  The  following  denotations  use  the  notation  of  Section  2.2.2. 
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p  will  be  an  integer  denoting  the  numerical  r«ink  of  the  operators  to  be  com¬ 
pressed. 

will  denote  the  ji-th  Chebyshev  node  of  order  p  on  the  interval  [0, 1],  and 
will  denote  the  j-th  Chebyshev  node  of  order  p  on  the  interval  [—1,0]. 

Ul  and  14  will  denote  pxp  matrices,  each  of  whose  columns  forms  an  orthonor¬ 
mal  set,  and  £  will  denote  a.  p  x  p  diagonal  matrix  such  that 

\\UlY.VI -Ml\\<c  \\Mi\\,  (2.80) 


where 

for  =  l,...,p.  Ul  and  Vi  are  used  to  compress  leftward  expansions,  and 
is  effectively  the  numerical  SVD  of  M-l- 

Ur  and  Vr  will  denote  pxp  matrices,  each  of  whose  columns  forms  an  orthonor¬ 
mal  set,  such  that 

||(/rSV/  -  MkW  <  e  ■  IIAIrH,  (2.82) 

where 

for  i,  j  =  1, . . .  ,p.  An  examination  of  the  matrices  Ml  and  Mr  reveals  that 
Ur  and  Ul  are  closely  related,  and  Vr  and  14  are  closely  related.  Ur  and  14  are 
used  to  compress  rightward  expansions,  and  UrEV^  is  effectively  the  numerical 
SVD  of  Mr. 


and  will  be  p-vectors  denoting  respectively  the  left  and  right  far-held 
expansions  for  subinterval  i  at  level  /.  These  will  be  dehned  by  the  formulae: 


(2.84) 


and  will  be  p-vectors  denoting  respectively  the  left  and  right  local 
expansions  for  subinterval  i  at  level  /.  These  will  be  dehned  by  the  formulae: 
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•  Mf ,  M^,  Ml  and  M^  will  be  p  x  p  matrices  for  obtaining  left  and  right  f«u‘- field 
expansions  for  subintervals  in  terms  of  the  left  and  right  far-field  expansions  of 
their  children.  These  will  be  defined  by  the  formulae 


Mf 

=  Ul 

Ml- 

Ul, 

Mk 

=  Ul 

■Mr 

■Ul, 

Mf 

=  Ul 

Ml- 

Ur, 

Mk 

=  ul 

■Mr 

■Ur. 

•  ^Li  ^R  will  be  p  X  p  matrices  for  obtaining  left  and  right  local 

expansions  for  subintervals  in  terms  of  the  left  and  right  local  expainsions  of 
their  parent.  These  will  be  defined  by  the  formulae 

St  =  Vl-Si-Vi, 

Sk  =  V^-Sn-Vi, 

St  =  V^-Si-Vn,  (2.87) 

Sk  =  V^-Sr-Vr. 

•  T^  «ind  T/'  will  be  p  x  p  matrices  for  obtaining  left  local  expansions  from  right 
far-field  expansions.  These  will  be  defined  by  the  formulae 

Tt  =  VI-Ts-Ul, 

T^  =  Vl  ■  T,  •  Ul.  (2.88) 

•  T/*  and  Tt  will  be  p  x  p  matrices  denoting  for  obtaining  right  local  expansions 
from  left  far-field  expansions.  These  will  be  defined  by  the  form  lae 

Tf  = 

T?  =  Vi-T.UR.  (2.89) 

Following  is  a  step-by-step  description  of  a  modified  version  of  Algorithm  2.1. 
Algorithm  2.2 

Step  Complexity  Description 

1  0(1)  Comment  [Input  problem  size,  iV,  and  a  real  number  e  >  0.] 

Set  size  of  Chebyshev  expansions  p  =  [-log5(£:)],  choose  p,  choose  s 
and  set  level  of  refinement  nlevs  =  p[og2(A/s)]. 
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2  0{2Np)  do  i 

Form  a  ^term  left  far-field  expansion  j- 

Form  a  ^term  right  far-field  expansion  ^nUvs.f 
enddo 

3  0{4Np‘^/s)  do  I  =  nlevs  —  I,. I 

do  i  =  1,.  ..,2' 

♦fc  =  Mt  ■  2, 

+  Mg  ■ 

enddo 

enddo 

4  O(10iVp^/s)  do  /  =  1, . .  .,n/et;s  —  1 

do  i  = 

.2i+l  +  '  ^f+l,2i+2 

Ki.2.  + 

=  Sl  ■  ffc  +  Tf  ■ 

®tn.2,  =  Sg  ■  »f,  +  rf  .  ,,2..2  +  T'’  ■  ♦« 

enddo 

enddo 

5  0(2Np)  do  i  =  1,...,2'*'«''* 

Evaluate  and  add  left  and  right  ^term  local  expansions 
enddo 

6  0(3iVs)  do  i  = 

For  each  point  in  subinterval  i  at  level  nlevs,  compute 
interactions  with  all  other  points  in  subintervals  t  -  1, 

and  add  to  far-field  values, 
enddo 

Total  0(N  ■  (4p  +  14p^/s  -|-  3s)) 


Remark  2.11  We  can  choose  s  to  minimize  the  total  operation  count  of  the  algo¬ 
rithm,  which  yields  s  ~  2p.  The  above  algorithm  then  requires  order 

ITAT.p  (2.90) 


arithmetic  operations. 


Remark  2.12  The  results  of  our  numerical  experiments  indicate  that,  for  a  fixed 
precision  e  and  corresponding  p,  p  (the  numerical  rank  of  the  matrix  operators  to 
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be  compressed)  is  approximately  p/2.  This  condition  together  with  (2.79)  and  (2.90) 
leads  us  to  expect  that  Algorithm  2.2  will  require  about  two-thirds  of  the  number  of 
arithmetic  operations  needed  by  Algorithm  2.1. 

Remark  2.13  The  operation  count  for  Step  5  assumes  that  the  points  {xi, . . .  ,Xjv} 
axe  reasonably  uniformly  distributed.  An  adaptive  version  of  this  cdgorithm  is  de¬ 
scribed  in  Section  2.3,  and  is  capable  of  handling  highly  nonuniform  distributions 
while  preserving  computational  efficiency  £ind  accuracy. 


2.3  The  Adaptive  FMM  in  One  Dimension 

The  algorithms  of  the  previous  section  have  one  drawback:  their  operation  count  is 
quite  sensitive  to  the  distribution  of  points,  and  they  become  inefficient  for  highly 
noDuniform  distributions.  We  now  describe  an  adaptive  version  of  Algorithm  2.2 
which  overcomes  this  deficiency,  its  complexity  being  0(N)  independently  of  the 
spacing  of  the  nodes.  This  versatility  is  achieved  by  using  different  levels  of  subdivision 
for  different  parts  of  the  computational  domain.  An  integer  s  is  fixed,  and  at  each 
level  of  refinement,  we  subdivide  only  those  intervals  which  contain  more  than  s 
points.  At  each  level,  then,  a  list  of  non-empty  subintervals  is  maintained  whose 
members  are  the  result  of  the  selective  subdivision  of  intervals  at  the  previous  level. 
This  policy  eliminates  the  inefficiency  of  the  non-adaptive  version,  where,  at  the  finest 
level  of  subdivision,  we  may  encounter  intervals  with  very  few  or  very  many  points. 
In  the  non-adaptive  version,  each  interval  has  two  nearest  neighbors  of  the  same  size, 
whereas  in  the  adaptive  version,  intervals  are  permitted  to  have  neighbors  of  differing 
size.  Figure  2.3  depicts  a  subdivision  of  the  computational  domain  for  a  nonuniform 
distribution  of  points. 

Remark  2.14  The  idea  of  selectively  subdividing  the  computational  domain  is  taken 
from  the  two  dimensional  adaptive  FMM  of  [7].  This  algorithm  requires  a  somewhat 
more  elaborate  data  structure  than  its  non-adaptive  counterpart  to  account  for  in¬ 
teractions  between  all  the  different  sized  boxes.  The  adaptive  algorithm  for  problems 
in  one  dimension  also  needs  additional  bookkeeping  to  keep  track  of  interactions  be¬ 
tween  all  the  different  sized  subintervals.  However,  the  simplified  geometry  of  the  real 
line  suggests  the  use  of  a  somewhat  more  efficient  data  structure  than  that  of  the  two 
dimensional  case  (see  Figure  2.3). 

Remark  2.15  It  is  clear  that  for  a  fixed  machine  precision,  £,  only  certain  distribu¬ 
tions  of  points  will  yield  meaningful  results.  For  example,  the  points  Xi  and  X2  are 
indistinguishable  if  |x2  —  Xi|  <  |  •  |xi  -i-  X2I.  To  avoid  such  cases  we  will  impose  that 
the  minimum  distance  between  two  points  must  be  greater  than  e-{h  —  a)  where  [a,  6] 
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Level 


Subinterval  Number 


0 

1 

2 

3 

4 


I - 2 - ^ 

I  7  i-S-H 


1 - 5 - ^ - S - 

I  9  I  10  I  11  I  12 
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Figure  2.3:  Hierarchy  of  subintervals  for  nonuniform  distribution. 
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is  the  computational  domain.  Under  this  condition,  the  highest  level  of  refinement  of 
the  computational  domain  is  bounded  above  by  |log2(£)|. 

2.3.1  Notation 

We  require  some  additional  notation  for  the  description  of  the  adaptive  algorithm  to 
supplement  that  of  Sections  2.2.2  and  2.2.4. 

•  nlevs  will  denote  the  level  of  finest  subdivision  of  any  part  of  the  interval  [—1,1]. 

•  For  a  fixed  precision,  £,  m  =  llog2(£)|  will  denote  the  maximum  level  of  refine¬ 
ment  of  the  interval  [—1, 1]  (see  Remark  2.15). 

•  Ii  will  denote  the  set  of  non-empty  subintervals  at  level  /,  i.e.  the  set  of  subin¬ 
tervals  at  level  I  resulting  from  the  bisection  of  a  larger  interval  at  level  /  —  1. 

•  If  subinterval  isub  contains  more  than  s  points,  it  is  called  a  parent  subinterval, 
and  UchUd{isub)  and  irchild(isub)  will  denote  its  left  child  and  its  right  child 
which  are  the  subintervals  resulting  from  its  bisection.  Otherwise,  isub  is  a 
called  a  childless  subinterval  and  ilchHd{isub)  and  irchild{isub)  are  set  to  0. 

•  ilnbr(isub)  and  irnbr{isub)  will  denote  the  left  and  right  neighbors  for  subin¬ 
terval  isub,  which  are  the  smallest  adjacent  subintervaJs  at  the  same  level  of 
refinement  or  a  coarser  one. 

•  and  will  be  p-vectors  denoting  respectively  the  left  and  right  far-field 

expansions  for  subinterval  i. 

•  and  will  be  ^vectors  denoting  respectively  the  left  and  right  local  ex¬ 

pansions  for  subinterval  i. 

2.3.2  Description  of  the  Algorithm 

This  section  contains  a  detailed  description  and  a  complexity  analysis  of  an  adaptive 
version  of  Algorithm  2.2. 

Algorithm  2.3 

Initialization  Step 

Comment  [Geometrical  preprocessing] 

Comment  [Input  problem  size,  N,  and  a  real  number  e  >  0.] 

Set  size  of  Chebyshev  expansions  p  =  [’-log5(£)],  choose  p,  choose  s  and  set  maximum 
level  of  refinement  m  =  [log2(iV/s)]. 
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/1  =  {[-1,0],[0,1]}_ 
do  /  =  1, . . . ,  m  while  Ii  is  non-empty 
do  isub  €  Ii 

if  isub  contains  more  thaji  s  points  then 

Add  ilchild{isub)  and  iTchild{isub)  to  It+i- 
else 

ilchild{isub)  =  0 
irchild{isub)  =  0 
endif 
enddo 
enddo 
nlevs  =  I 

do  /  =  1, . . nlevs 
do  isub  €  Ii 

if  isub  is  not  childless  then 

ilnbr{irchild{isub))  =  ilchild(isub) 
irnbr{Hchild{isub))  =  irchUd(isub) 
if  ilnbr{isub)  is  not  childless  then 

ilnbr(ilchild{isub))  =  irchild{ilnbr(isub)) 
else 

Unbr{ilchild{isub))  =  ilnbrfisub) 

endif 

if  irnbr{isub)  is  not  childless  then 

irnbr(irchild{isub))  =  ilchild{irnbr{isub)) 
else 

irnbr{irchild{isub))  =  irnbr{isub) 
endif 
endif 
enddo 
enddo 

Step  1 

Comment  [Upward  Pass] 

do  /  =  nlevs, . . . ,  1 
do  isub  6  Ii 

if  isub  is  a  childless  subintervaJ  then 
Form  a  ^term  far-field  expansion 
Form  a  ^term  far-field  expansion 
else 

^isu6  ~  ^ ilchild{isub)  ^ irchild(isub) 

^isub  ~  ^ UchUd(istib)  ^ irchild{isub) 

endif 
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enddo 

enddo 


Step  2 

Comment  [Downward  Pass] 


do  /  =  1, . .  .,nlevs 
do  isub  €  Ii 

if  isub  is  a  childless  subinterval  then 

Evaluate  and  add  ^term  local  expansions  +  '^^sub 
else 


HiL  _  cR  .  ij/L 

^  ilchild{isub)  ^  isub 

xifL  —  cR  ,  \TfL 

^  iTchild(isub)  ^R  ^  isub 

\uR  —  cR  .  \uR 

^  ilchild{isub)  *«iu6 

=  cH  .  mR 

^  iTchild(isub)  ^R  ^  isub 


if  ilnbr(isub)  is  a  childless  subinterval  then 

Add  contribution  of  ^irchiid{isub)  point  in  ilnbr{isub) 

Add  contribution  of  each  point  in  ilnbr{isub)  to  '^^rchild{isub) 
else 


^  itchild{isub) 


^ ilchild{isub)  ^ ilchild{ilnbr{isub)) 


^ iTchild{isub)  ^ iTchild(isub)  ^ iTchild{ilnbT(isub))  ^2  ^ ilchUd{ilnbr{isub)) 

endif 

if  irnbr{i$ub)  is  a  childless  subinterval  then 

Add  contribution  of  ^uchiid(isub)  point  in  i‘^nbr{isub) 

Add  contribution  of  each  point  in  irnbr{isub)  to  ^^chiid(isub) 
else 

^ ilchild(isub)  ^ ilchild{isub)  ^ ilchild{iTnbT{isub))  ^2  ^ irchild{irnbr{isub)) 

mR  —  I  qpR  . 

^  irckild(isub)  ^  iTchild(isuh}  '  1  ^irchild{irnbT{isub)) 

endif 

endif 


enddo 

enddo 


Step  3 

Comment  [Direct  Interactions] 

do  isub  =  1, . .  .,nsub 

if  isub  is  childless  then 

For  each  point  in  subinterval  isub,  compute  interactions  with 
all  other  points  in  this  subinterval  and  in  adjacent  childless 
subintervals,  and  add  to  far-held  values. 

endif 

enddo 
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Following  is  a  complexity  analysis  of  Algorithm  2.3.  Before  presenting  a  step-by- 
step  breakdown  of  the  operation  counts,  we  need  two  lemmas.  These  lemmas  provide 
upper  bounds  on  the  numbers  of  subintervals  which  can  be  created  by  the  process  of 
selective  subdivision. 

Lemma  2.11  For  any  subdivision  of  the  computational  domain  produced  by  Algo¬ 
rithm  2.3,  the  number  of  childless  intervals  is  bounded  by 

2  108,(1).^.  (2.91) 

Proof.  Each  parent  interval  at  level  /  contains  more  than  s  points  (otherwise  it  would 
not  be  further  subdivided).  Therefore,  the  total  number  of  parent  intervals  at  level 
/  is  bounded  above  by  N/s.  Each  parent  interval  has  two  children  by  definition,  so 
the  number  of  childless  boxes  at  any  level  /  is  bounded  above  by  2Nfs.  The  bound 
(2.91)  follows  from  the  fact  that  the  number  of  levels  of  subdivision  is  bounded  above 
by  log2(l/£)  (see  Remark  2.15).  □ 


Lemma  2.12  For  any  subdivision  of  the  computational  domain  produced  by  Algo¬ 
rithm  2.3,  the  total  number  of  intervals  is  bounded  by 

3  .  log,  (i)  .  (2.92) 

Proof.  The  number  of  parent  intervals  at  level  /  is  bounded  above  by  N/s.  Each  of 
these  parent  intervals  has  two  children,  so  the  number  of  childless  boxes  at  any  level  I 
is  bounded  above  by  2N/s.  Thus,  the  total  number  of  intervals  at  all  levels  (childless 
and  parent)  is  bounded  by  log2(l/£)  •  {N/s  2N/s).  □ 

Following  is  a  step-by-step  breakdown  of  the  operation  counts  of  Algorithm  2.3. 


Step  Complexity 

1  0{2pN)-h 

0{4p^mN/s) 

2  0{2pN)-\- 


Description 

Each  point  contributes  to  the  left  and  right  ^term  far-held 
expansions  of  the  childless  subinterval  in  which  it  lies. 

For  each  parent  subinterval  (there  are  at  most  mN/s)  we 
perform  4  p  x  p  matrix-vector  products. 

We  evaluate  and  add  the  left  and  right  ^term  local  expan¬ 
sions  for  each  of  the  N  points. 
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0{4p^mN/s)+  For  each  parent  subinterval  (there  are  at  most  mN/s)  we 
perform  4  p  x  p  matrix-vector  products. 

0{6p^mNfs)+  For  each  parent  subinterval  (there  are  at  most  mN/s)  we 
perform  at  most  6  p  x  p  matrix-vector  products. 

0{4pmN)  For  each  parent  subinterval  (there  <ire  at  most  mN/s)  we 

perform  at  most  4  p  x  s  matrix-vector  products. 

3  0{3Ns)  Each  of  the  yj  interacts  directly  with  those  Xk  which  lie  in 

its  own  subinterval  and  the  two  nearest  neighbors.  There 
are  at  most  3s  of  these  points  Xk- 

Total  0{14p^mN/s  +  4pmN  +  4pN +  3Ns) 

Remark  2.16  We  choose  s  «  2p,  as  in  Algorithm  2.2  (see  Remark  2.12).  The  above 

algorithm  then  requires  order 

N  •  (11pm  -t-  lOp)  (2.93) 

arithmetic  operations. 


2.4  FMM  for  Other  Kernels 


The  algorithms  described  in  Sections  2.2  and  2.3  are  all  designed  for  evaluating  sums 
of  the  form 


/(x)  =  E-^- 

fc=i  ^ 


(2.94) 


However,  these  algorithms  are  only  mildly  dependent  on  the  choice  of  kernel,  i.e.  they 
can  be  modified  to  evaluate  expressions  of  a  more  general  form,  given  by  the  formula 


N 


A:=l 


(2.95) 


where  is  singular  at  0  but  smooth  everywhere  else.  Examples  of  two  closely  related 
functions  with  this  property  are  ^(x)  =  log(x)  and  (j>{x)  =  1/x^,  which  are  readily 
obtained  by  integrating  and  differentiating  the  expression  (2.94).  Another  example 
of  interest  is  <^(x)  =  l/tan(x)  on  the  interval  [— f,f].  This  case  is  discussed  in 
more  detail  in  Chapter  3,  where  it  arises  in  the  formulation  of  the  trigonometric 
interpolation  problem. 
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Remark  2.17  While  the  algorithmic  procedures  for  different  kernels  4>  are  virtually 
identical,  different  sets  of  formulae  are  needed  for  the  creation  and  manipulation  of 
the  Chebyshev  far-field  and  local  expansions  which  are  required  by  the  algorithms. 
These  formulae  can  be  obtained  by  constructing  analogs  of  Theorems  2.5,  2.7,  2.8, 
2.9  and  2.10  for  the  particular  function  (f>. 


2.5  A  Fast  Algorithm  for  Polynomial  Interpola¬ 
tion 


In  this  section  we  return  to  the  original  problem  of  this  chapter:  given  a  set  of  points 
{ii, . . .  ,xn}  and  function  values  {/i, . . . ,  /n},  evaluate  the  unique  interpolating  poly¬ 
nomial  at  the  points  {yi, . . .  ,2/n}-  Recall  from  (2.2)  that  the  Lagrange  interpolating 
polynomial  defined  by  the  formula 

X  —  Xk 


pn(x) = e  /.  ■  n 

fc=:l 

k^] 


J=1  fc=:l 


can  be  rewritten  in  the  form 


N 


N 


N 


Furthermore, 


and. 


(2.96) 


(2.97) 


fcVl  k-\  ~^k 

k^J 

^  f  -S 

PN{yi)  =  n-i: 

yi  - 

(2.98) 

and  Sj  are  defined  by  the  formulae 

N 

(2.99) 

k=\ 

•  -'j 

(2.100) 

Observation  2.18  Sums  of  the  form  131n(x  —  Xk)  can  also  be  evaluated  using  the 
algorithms  of  this  chapter  (see  Section  2.4)-  The  numbers  {r/}  and  {sj}  can  therefore 
be  computed  in  0(iVlog(^))  operations  according  to  (2.99)  and  (2.100). 

Following  is  a  description  of  an  algorithm  for  the  efficient  evaluation  of  expressions 
of  the  form  (2.98). 
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Algorithm  2.4 

Step  Complexity  Description 

Init  0(JVlog(i))  Compute  the  numbers  {r/}  and  {sj}. 

1  0(N)  doi  =  l,n 

ffj  =  fj  ■ 

end  do 

2  C>(JV  log(i))  Compute  pi  =  QjliVl  —  ^j)  using  Algorithm  2.3  (or  Algorithm  2.2). 

3  0(N)  do/  =  l,n 

Pi  =  pi-  ri 

end  do 

Total  0(iVlog(i)) 

Remark  2.19  It  is  well  known  that  the  polynomial  interpolant  of  a  function  is  spec¬ 
trally  accurate  when  the  function  is  tabulated  at  Chebyshev  or  Legendre  nodes  (which 
axe  clustered  near  the  interval  ends),  whereas  the  interpolation  errors  can  be  arbi¬ 
trarily  large  when  the  function  is  tabulated  at  general  distributions  of  points  (see,  for 
example,  [9],  [23]).  It  is  expected  that  many  practical  applications  of  Algorithm  2.4 
will  assume  nonuniformly  spaced  nodes  which  are  clustered  neau  the  extremities  of 
the  interval. 

2.6  Applications  in  Numerical  Integration  and  Dif¬ 
ferentiation 

The  fast  polynomial  interpolation  algorithm  of  this  chapter  can  be  applied  to  a  variety 
of  problems.  One  such  example  is  discussed  in  this  section.  Here  we  will  consider  the 
folowing  problem:  given  a  set  of  points  {xi, . . . ,  xyv}  and  function  values  {/i, . . . ,  /yv}, 
evaluate  the  integrals  and  derivatives  of  the  interpolating  polynomial  at  the  points 
{x;t}.  In  other  words,  we  wish  to  compute 

j  Piv(x)dx  and  P't/ixk)  (2.101) 

for  k  =  1, . . . ,  A,  where  P^  is  the  interpolating  polynomial  for  the  function  values 
{fk}  at  the  points  {xjt},  defined  by  the  Lagrange  formula 

=  (2.102) 

We  will  make  use  of  the  following  lemma,  which  may  be  found  in  the  appendix 
to  [13].  This  lemma  describes  formulae  for  the  integration  and  differentiation  of 
Chebyshev  expansions. 
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Lemma  2.13  Let  P^  be  a  polynomial  given  by  a  Chebyshev  series 

N-l 

^  Ok-  Tk(x). 
kzzO 

Then,  the  integral  of  Pf^  has  a  series  expansion  of  the  form 


/  PN{t)dt  =  Yl^k-Tk{x), 

J-i 


where 


for  2  <  k  <  N, 


-  ^  •  (a*:-i  -  a/c+i) 
=  2  '  ~ 


and  the  derivative  of  P^  has  a  series  expansion  of  the  form 

£pf,(x)  =  Y^dk-n{x), 


where 


far  l<k<N  -2, 


}=k+l 
j+k  odd 


(2.103) 


(2.104) 


(2.105) 


(2.106) 


(2.107) 


Remark  2.20  It  can  be  shown  that  the  process  of  numerical  differentiation  via 
Chebyshev  series  has  a  condition  number  proportional  to  N"^,  whereas  the  process 
of  numerical  integration  via  Chebyshev  series  has  a  condition  number  bounded  by  2. 
Thus,  numerical  differentiation  of  this  type  is  not  usually  favored  when  laurge  scale 
calculations  axe  being  performed.  On  the  other  hand,  numerical  integration  is  vir¬ 
tually  insensitive  to  problem  size,  and  is  a  powerful  tool  in  the  solution  of  certain 
classes  of  differential  equations,  for  example  (for  a  more  detailed  discussion,  see  [15]) 

The  two  algorithms  described  below  perform  the  spectral  integration  and  differ¬ 
entiation  of  the  Lagrange  polynomial  interpolant  of  a  function  which  is  tabulated  at 
nodes  other  than  Chebyshev.  In  these  descriptions  we  will  assume  that  Xk  €  [— LI] 
for  A:  =  1, . . . ,  iV,  and  that  ti,. . .  ,tN  are  Chebyshev  nodes  of  order  N  on  the  interval 
[-1,1]. 
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Algorithm  2.5 

Step  Complexity  Description 

1  0(iVlog(i))  Interpolate  from  {life}  to  {4}  using  Algorithm  2.4. 

2  0{N  log  N)  Compute  Chebyshev  coefficients  using  fast  cosine  transform. 

3  0(N)  Integrate  Chebyshev  series  using  (2.105). 

4  0(N  log  N)  Evaluate  new  series  at  Chebyshev  nodes  using  fast  cosine  transform. 

5  0(iVlog(i))  Interpolate  from  {t/t}  to  {ijt}  using  Algorithm  2.4. 

Total  0(N  -logN  +  N-  log(i)) 


Algorithm  2.6 

Step  Complexity  Description 

1  0(iV log(l))  Interpolate  from  {ifc}  to  {t*}  using  Algorithm  2.4. 

2  0(N  log  JV)  Compute  Chebyshev  coefficients  using  fast  cosine  transform. 

3  0(N)  Differentiate  Chebyshev  series  using  (2.107). 

4  O(iVlogJV)  Evaluate  new  series  at  Chebyshev  nodes  using  fast  cosine  transform. 

5  0(Alog(l))  Interpolate  from  {tfc}  to  {ijt}  using  Algorithm  2.4. 

Total  0(N-logN  +  N-log(i)) 
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Chapter  3 

Trigonometric  Interpolation  and 
FFTs 


In  this  chapter  we  will  consider  the  transformation  F  : 
formulae 

N/2-1 

fWi  =  /i=  E 

k:=-N/2 

for  j  =  1, . . . ,  N,  where  i  =  {ii, . . . ,  i>f)  is  a  sequence  of  real  numbers  in  (— tt,  and 
a  =  {a-N/2,  ■  •  •  ,ots/2-i}  and  /  =  are  sequences  of  complex  numbers. 

We  axe  interested  in  the  efficient  application  and  inversion  of  the  transformation  F 
and  its  transpose.  More  precisely,  we  will  consider  the  following  four  problems: 

•  Problem  3,1:  Given  a,  find  /  =  F(q). 

•  Problem  3.2;  Given  a,  find  /  =  F^{a). 

•  Problem  3.3:  Given  /,  find  q  =  F~^{f). 

•  Problem  3,4:  Given  /,  find  a  =  (F^)~^(/). 

In  this  chapter  we  will  describe  a  group  of  four  efficient  algorithms  for  Prob¬ 
lems  3. 1-3.4.  These  algorithms  utilize  the  fact  that  a  Fourier  series  is  a  trigonometric 
polynomial;  when  dealing  with  the  values  of  this  polynomial  at  equispaced  nodes  on 
the  unit  circle,  the  FFT  can  be  applied.  However,  we  are  interested  in  the  values  at 
nonuniformly  spaced  nodes,  which  are  values  of  the  polynomial  which  interpolates  the 
equispaced  values.  The  algorithms  we  will  describe  rely  for  their  efficiency  on  a  com¬ 
bination  of  the  FFT  with  a  fast  algorithm  for  evaluating  trigonometric  polynomial 
interpolants  which  uses  a  version  of  the  Fast  Multipole  Method  (FMM)  specificcilly 


— ^  defined  by  the 

(3.1) 
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designed  for  the  geometry  of  the  circle.  This  interpolation  algorithm  is  closely  re¬ 
lated  to  Algorithm  2.4  described  in  Chapter  2  for  the  interpolation  of  polynomials 
tabulated  on  the  line. 

Remark  3.1  Throughout  this  chapter  we  will  be  using  the  well  known  Lagrange 
representation  of  polynomial  interpolants.  For  a  function  /  :  C  — >  C  tabulated  at 
nodes  zi, . . . ,  z^,  this  is  defined  by  the  formula 

(3.2) 

j=i  k=i 

This  chapter  is  divided  into  two  sections.  Section  3.1  contains  a  number  of  results 
from  analysis  and  approximation  theory,  and  in  Section  3.2  we  describe  both  formally 
and  informally  how  these  results  are  used,  together  with  the  FMM,  in  the  construction 
of  the  fast  £dgorithms  of  this  chapter. 

Remark  3.2  An  alternative  approach  to  the  problems  of  this  chapter  is  presented 
in  Chapter  4,  where  an  interpolation  scheme  based  on  the  Fourier  analysis  of  the 
Gaussi2m  bell  is  used  in  place  of  the  FMM-based  interpolation  scheme  of  this  chapter. 
The  two  approaches  are  compared  in  Chapter  6. 


3.1  Mathematical  and  Numerical  Preliminaries 

This  section  is  divided  into  two  parts.  In  Subsection  3.1.1  we  present  several  identities 
which  axe  employed  in  the  development  of  the  fast  algorithms  of  this  chapter.  Subsec¬ 
tion  3.1.2  contains  a  collection  of  error  bounds  which  allow  us  to  perform  calculations 
to  any  prescribed  accuracy. 

3.1.1  Analytical  Tools 

The  main  results  of  this  subsection  are  Theorems  3.3  and  3.4  which  describe  linear 
transformations  connecting  the  values  of  a  Fourier  series  at  two  distinct  sets  of  points. 
Lemma.s  3.1  and  3.2  provide  intermediate  results  which  are  used  in  the  proofs  of  these 
theorems. 

Lemma  3.1  Let  {xi, . . .  ,x;v}  and  {j/i, . . .  ,yA  }  be  sequences  of  real  numbers  on  the 
interval  [— tt,  tt],  and  let  {u?!, . . . ,  iutv}  and  {zi, . . . ,  z;v}  be  sequences  of  complex  num¬ 
bers  defined  by  the  formulae 
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/or  j  =  1, . . .  ,iV.  Then, 
N 


n(^'  “  •  n  -  yit)/2) 


t=i 

k^i 


fc=l 

k^j 


for  I  =  l,...,N,  and 


N  N 

_  JN-1)/2  -tt  1/2 


n(^i  -  ^k)  =  2j 


n  V  •  2*  •  siniiyj  -  yk)/2) 


k=l 

k^j 


A:=l 

k^j 


for  3  = 


(3.5) 


(3.6) 


Proof.  A  sequence  of  simple  algebraic  manipulations  and  trigonometric  identities 
yields 


N 


N 


fcsl 
k^i 

N 


k=l 

k^i 


k=l 

k^j 


N 


~  g«{^  i)ii/2  .  J2  g»»*/2 . 2i .  sin((i/  —  yk)/2)  (3-7) 


(7V-l)/2 

=  w)  " 


*=1 

k^j 

N 


n  ■  2i  ■  sin((x/  -  yk)/2). 


k=l 

kjtj 


Substituting  Zj  for  wt  cind  yj  for  i/  in  (3.7),  we  also  obtain 

n  (^J  ~  ^k)  =  ■  n  ■  2i  ■  sin((yj  -  yk)l2). 


k=l 

kltj 


k=l 

klij 


(3.8) 


□ 


The  following  lemma  describes  an  alternative  representation  of  the  well  known 
Lagrange  interpolation  formula  for  polynomials  in  the  case  when  the  interpolation 
points  lie  on  the  circle. 
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Lemma  3.2  Let  {xi,...,X7v}  and  be  sequences  of  real  numbers  in  the 

interval  [— x,5r],  and  let  {fi,  ■  ■  ■ ,  Jn}  be  a  sequence  of  complex  numbers.  Further, 
let  {iwi, . . .  and  {zi, . . . ,  zjsi}  be  sequences  of  complex  numbers  defined  by  the 

formulae 


Wj  = 

z,  = 


(3.9) 

(3.10) 


for  j  =  1, . . .  ,N .  Then, 

f  TT  nil  —  Zk  N/2 

Zlfrliz-—  = 

j=i  k=i 


j=i  \ 


1 


tan((x;  -  yj)/2) 


-A3.11) 


where  {c/}  and  {dj}  are  -^^fined  by  the  formulae 


IV 

C/  =  n  sin((x/  -  yk)/2), 

k=\ 

~  feUi  sin((y_^  -  j/fc)/2) 


forj,l=  1,...,N. 

Proof.  Dividing  (3.5)  by  (3.6)  we  obtain 


N 

n 


wi  -  Zk  wl'^  SL  z]!"^  •  2i  •  sin((x/  -  j/jfe)/2) 


n 


~  •  sin((yj  -  yfc)/2) 


e  »^)/2  «;f''^nfeLisin((x,  -  yfc)/2) 


sin((x/  -  y_,)/2)  sin{{yj  -  yk)/2)  ’ 

and  the  combination  of  (3.14)  with  the  fact  that 

e-«(^t-»j)/2  cos((x<  -  yj)/2)  +  i  sin((x/  -  yj)/2) 


sin((x/  -yj)/2) 


sin((xi  -  yj)/2) 


1 


tan((x/  -  yj)/2) 


—  t. 


1 


(3.12) 

(3.13) 


(3.14) 


gives  us 

^  ^  N  f 

t  rr  ~  ^k  N/2  ^  r  -Nf2  J  I 

■'  ^  \tan((x, -yj)/2) 


(3.15) 


-  i  (3.16) 
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where  {c/}  and  {dj}  are  defined  by  (3.12)  euid  (3.13).  □ 

The  following  theorem  provides  a  formula  for  determining  the  values  of  a  Fourier 
series  at  a  set  of  points  in  terms  of  the  values  of  this  series  at  another  set  of  points. 

Theorem  3.3  Let  {xi,...,x^}  and  sequences  of  real  numbers  in 

the  interval  [— 7r,7r],  and  let  -  ®  sequence  of  complex  numbers. 

Further,  let  {wi,...,wn},  {zi, . . .  and  {gi,...,gN}  be  sequences 

of  complex  numbers  defined  by  the  formulae 


Wj  =  e*''-’ 

(3.17) 

(3.18) 

N/2-1 

f,  =  z 

(3.19) 

k=-N/2 

N/2-1 

9,  =  Z 

(3.20) 

k=-N/2 

for  j  =  I,. . .  ,N.  Then, 


N  / 

91  =  Cl -Yl  fj  • 


-  Vtan((x,-y,)/2)  7’ 

where  {c/}  are  defined  by  (3.12)  and  {dj}  are  defined  by  (3.13). 

Proof.  Let  the  polynomial  Pa  be  defined  by  the  formula 

N-\ 

Pc{z)  =  Y  ^k-N/2  ■  2*- 

fc=0 


(3.21) 


(3.22) 


The  Lagrange  interpolation  formula  relates  the  values  of  Pa  at  the  points  {wi}  to  the 
values  at  the  points  {zk}  via  the  expressions 


i=i  fc=i 


Wl  -  Zk 


k=l 


for  1  =  1, . . . ,  AT,  and  applying  Lemma  3.2  to  (3.23)  we  obtain 


Pa{wi)  =  -CfY  PaiZj)  '  Zj  ( 

j=l  V 


tan((x/  -  t/j)/2) 


(3.23) 


(3.24) 
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From  the  combination  of  (3.22)  and  (3.17)-(3.20)  we  see  that 

N/2-1 

Pay^i)  —  Zj  2_^  Z^  —  Zj  Jj 

k=-N/2 

N/2-1 


and 


Pa{wi)  =  w^^^-  Y,  otk  ■  wf  =  ■  gi, 

k=-N/2 

and  finally  substituting  (3.25)  and  (3.26)  into  (3.24)  we  obtain 


9i 


N  / 

=  Cl -Yfr  •  ( 
j=i  V 


1 


tan((x;  -  yj)/2) 


—  i 


(3.25) 


(3.26) 


(3.27) 


for  /  =  1, . . . ,  iV. 


In  the  case  when  the  points  {yj}  are  equispaced  in  [— x,7r],  the  interpolation  for¬ 
mula  of  Theorem  3.3  has  a  simpler  form,  which  is  described  in  the  following  theorem. 
The  result  of  this  theorem  can  be  found  in  a  slightly  different  form  in  [12]. 

Theorem  3.4  Let  {zi, zjv}  be  a  sequence  of  real  numbers  on  the  interval  [— ?r,  t] 
and  let  {oc.n/2^  •  ■  cxN/2-i}  be  a  sequence  of  complex  numbers.  Further,  let  {yi, . .  • ,  J/n} 
be  a  sequence  of  real  numbers  defined  by  the  formulae 

y,  =  (j  -  1  -  N/2)x/N  (3.28) 

for  j  =  1,...,N,  and  let  {wi,...,wn},  {zi,...,zn),  and 

he  sequences  of  complex  numbers  defined  by  the  formulae 


Wj  =  e'""’ 

(3.29) 

Zj  = 

(3.30) 

N/2-l 

f,  =  Z 

(3.31) 

k=-N/2 

N/2-1 

»  =  E 

(3.32) 

k=-N/2 

for  j  =  1, . . . ,  N .  Then, 

2  j  ^  ^  N  (tan((i,  -  yj)/2)  /’ 

(3.33) 
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Proof.  From  the  combination  of  equations  (3.28)  and  (3.31),  we  see  that  the  sequence 


{oi-N/2,  ■  ■  ■  ,OtN/2- 

-i}  is  the  discrete  Fourier  transform  of  the  sequence  {fi,  ■  ■ 

•  in 

other  words, 

}=i 

(3.34) 

for  k  =  —N/2, . . . 

.  ,N/2  —  1.  Let  us  now  define  the  fimction  /  :  [— 7r,7r]  — > 

C  by  the 

formula 

N/2-1 

/(x)=  Y. 

(3.35) 

k=-N/2 


Substituting  (3.34)  into  (3.35)  and  changing  the  order  of  summation,  we  obtain 


N/2-1  ,  N 

k=-N/2  ■'  j=l 


N 


1 


N/2-1 


=  Zfrjj-  E 


(3.36) 


i=l  '' '  k=-N/2 

Observing  that  the  second  sum  in  the  expression  (3.36)  is  a  geometric  series  we  have 


e-«A^(x-v,)/2  _  ,N(x-y,)/2 
Y]  e*Mx-yj)  _  ^  ^ 

k=-N/2 


1  _ 

sm{N{x  -  yj)/2) 


e»(x  ■  s\n{{x  —  yj)/2) 

=  (sin(7V(x  -  yj)/2))  ■  (cot((x  -  yj)/2)  -  i) 


(3.37) 


for  any  x  €  [— 7r,7r].  The  definition  of  {yj}  now  yields 


sin(iV(x  —  yj)/2)  =  s\n{Nx/2)cos{Nyj/2)  —  cos{Nx/2)sm{Nyj/2) 

=  sm{Nx/2)-{-iy,  (3.38) 


and  finally,  using  the  fact  that  gi  =  f{xi)  and  combining  (3.36),  (3.37)  and  (3.38)  we 
obtain 


fNxi\ 


N 

r 


(-1V  ( 


/  r%  rt/\\ 
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3.1.2  Relevant  Facts  from  Approximation  Theory 

The  algorithms  of  this  chapter  are  based  on  several  results  from  the  Chebyshev  ap¬ 
proximation  theory  of  the  function  l/tan(x).  These  results  are  contained  in  the 
lemmas  and  theorems  of  this  subsection,  numbered  3.8-3.'  4.  Analogs  of  these  results 
for  the  function  1/x  can  be  found  in  Section  2.1. 

The  main  results  of  this  section  fall  into  two  categories.  Theorems  3.11  and  3.14 
describe  how  the  function  l/tan(i)  can  be  approximated  on  different  regions  of  the 
interval  [— f,f]  using  Chebyshev  expansions.  Theorems  3.15,  3.16  and  3.17  provide 
three  ways  of  manipulating  these  expansions  which  are  needed  by  the  fast  algorithms 
of  this  chapter. 

We  begin  with  three  classical  definitions  which  can  be  found,  for  example,  in  [14], 
[23]. 


Definition  3.1  The  n-th  degree  Chebyshev  polynomial  T’n(x)  is  defined  by  the  follow¬ 
ing  equivalent  formulae: 


Tn{x)  =  cos(narccosx)  (3.40) 

Tn{x)  =  i  •  ((x  +  +  (x  -  .  (3.41) 


Definition  3.2  The  roots  tj,.  .,tn  of  the  n-th  degree  Chebyshev  polynomial  Tn  lie 
in  the  interval  [—1,1]  a,nd  are  defined  by  the  formulae 


tk^ 


cos 


'2k -1 


n 


for  k  =  1, . . . ,  n.  They  are  referred  to  as  Chebyshev  nodes  of  order  n. 


(3.42) 


Definition  3.3  We  will  define  the  polynomials  of  order  n  —  I  by  the  for¬ 

mulae 

=  n  (3-43) 

/t=i 

for  j  =  1, . . .  ,n,  where  tk  a,re  defined  by  (3.42). 

The  order  n  —  1  Chebyshev  approximation  for  a  function  /  :  [—1,1]  — ^  C  is  defined 
as  the  unique  polynomial  of  order  n  —  1  which  agrees  with  /  at  the  nodes  ti, . . . 
There  exist  several  standard  representations  for  this  polynomial,  and  the  one  we  will 
use  in  this  chaptei  is  given  by  the  expression 

EMO-MO- 

j=l 


(3.44) 
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For  the  purposes  of  this  chapter,  Chebyshev  expansions  for  any  function  will  be 
characterized  by  values  of  this  function  tabulated  at  Chebyshev  nodes. 

Lemmas  3.5-3. 7  provide  estimates  involving  Chebyshev  expansions  which  are  used 
in  the  remainder  of  this  section.  The  proof  of  Lemma  3.5  is  obvious  from  (3.40). 

Lemma  3.5  Let  Tn{x)  be  the  Chebyshev  polynomial  of  degree  n.  Then, 

\Tnix)\  <  1  (3.45) 


for  any  x  £  [—1, 1]. 

Lemma  3.6  Let  Tn{x)  be  the  Chebyshev  polynomial  of  degree  n.  Then, 


IWI  > 


5i 


(3.46) 


for  any  x  such  that  jil  >  3. 

Proof.  From  Definition  3.1,  we  have 

1 


\Tnix)\  =  --lix+v^^r+ix-vx^ri 

>  5 .  ii + v/i*  -  (x/3pr  =  5  ■  ii  ■  (1  +  v^)r 


(3.47) 


1 

>  2- 

for  any  x  such  that  |x|  >3. 


5x 


Lemma  3.7  Let  Uj{x)  be  defined  by  (3.43).  Then,  for  any  x  €  [—1, 1], 

|u,(x)|  <  1.  (3.48) 

Proof.  It  is  obvious  from  (3.43)  that  Uj{tj)  =  1,  and  that  Uj{tk)  =  0  when  k  j.  In 
addition,  the  expression 

-  Y.  nit,)  ■  nix)  (3.49) 

”  k=i 

is  also  equal  to  1  at  tj  and  equal  to  0  at  all  other  t*.  Since  both  Uj  and  (3.49)  are 
polynomials  of  order  n  —  1,  we  have 

u,ix)  =  -Ynit,)-nix) 

”  fc=i 


(3.50) 
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for  j  —  Furthermore,  due  to  the  combination  of  (3.50)  and  the  triangle 

inequality,  we  obtain 

E  Tk{tj)  ■  Tk{x)  •  in(x)l  <  1  (3.51) 

"  fc=i  ”  k=i 

for  any  X  e  [-1, 1].  □ 

The  next  lemma  is  obvious. 

Lemma  3.8  For  any  a  €  [0,  |], 

tan  3a  >  3  •  tan  a.  (3.52) 

The  following  two  lemmas  provide  preliminary  results  which  are  used  in  the  proof 
of  Theorem  3.11. 


Lemma  3.9  Suppose  that  n  >  2,  and  that  6  >  0  and  Xq  are  real  numbers  with 
|xol  >  Zb.  Then,  for  any  x, 

t  ^ (I)  -  (1  +  x5) .  (3.53) 


Proof.  Let  Q{x)  by  the  polynomial  of  degree  n  defined  by  the  formula 

It  follows  from  the  combination  of  (3.43)  and  (3.54)  that 

Qibh)  =  l  +  bhxo-{btk-Xo)-j2\~^-Uj{h) 

j=i  ^0 

=  l  +  bhxo-{bt,-xo)~-^  =0, 

otk  -  Xo 

for  k  =  1, . . .  ,n.  Clearly,  then,  Q{x)  satisfies  the  conditions 


(3.54) 


(3.55) 


Q{xo)  =  1+  Xq 

Qibt^)  =  0 
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It  is  clear  that  the  function 


(1  +  4) 


(3.57) 


Tnixo/b) 

is  also  a  polynomial  of  degree  n  which  satisfies  the  n  + 1  conditions  (3.56).  Therefore, 


Q{x)  =  (1  +  Xo) 


U^/b) 


T^ixolbY 

and  (3.53)  follows  as  an  immediate  consequence  of  (3.54)  and  (3.58). 


(3.58) 


Lemma  3.10  Suppose  that  n  >  2,  and  that  6  >  0  and  xq  are  real  numbers  with 
|xol  >  35.  Then, 


1  +  XXO  ^1  +  btjXo  /'x\ 


1+952 

5-5" 


X  Xq  j”!  ^0 

for  any  x  €  [~5, 5]. 

Proof.  Dividing  (3.53)  by  (i  -  xo)  and  taking  absolute  values,  we  obtain 


1  +  xxq  ^  1  +  btjXp  ^  /x\ 

X  —  Xq  ^  btj  —  Xo  \bj 

Due  to  Lemmas  3.5  and  3.6  we  have 

|rn(x/5)l  <  1 

for  any  x  €  [—5,5],  and 


i  +  xg  |r„(x/5)| 


|x-xo|  |r„(xo/5)|’ 


1  +  4 


T„(xo/5) 


<(1+4) -2 


35 


5xo 


<  ^  •  (1  +  (35)^) 


(3.59) 


(3.60) 


(3.61) 


(3.62) 


for  any  |xol  >  35.  Finally,  substituting  (3.61)  and  (3.62)  into  (3.60),  we  obtain 

1  +  952 


1  +  xxq  ^  1  +  btjXp  /x\ 
X  —  Xo  ^1  btj  —  Xo  ^  \  bj 


5  •  5" 


(3.63) 


for  any  x  €  [—5,5]. 
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Theorem  3.11  Suppose  that  n  >  2,  and  that  a  and  do  are  real  numbers  with  0  < 
a  <  ^  and  3a  <  |^o|  <  f  •  Then, 

1 

tan(0  —  6o) 
for  any  9  £  [—a,  a]. 


1  +  tj  tan  a  tan  di 
^  tj  tan  a  —  tan  Oq 


-E 


'o  /tan0\ 
I  ^  \tana J 


1  +  9tan^a 
tan  a  •  5" 


(3.64) 


Proof.  Let  9  £  [—a,  a].  Then,  defining  the  real  numbers  b,  x,  and  xq  by  the  formulae 
b  =  tan  a,  X  =  tan  9  and  Iq  =  tan  9o,  we  observe  that  |x|  <  6,  and,  due  to  Lemma  3.S, 
Ixol  ^  tan  3a  >  36.  We  also  observe  that 


1 


1  +  tan  9  tan  1  + 


tan(0  —  ^o)  tan  9  —  tan 


xo 


(3.65) 


and 


^  1  +  t j  tan  a  tan 


X  "T  ua 

^  tjtana 


Uj 


'^]=±  i±ii^  .  f  £)  .  (3.66) 

tana/  ^  tjb  —  Xo  ^  KbJ 


tan  ^0  V  tan  i 

It  follows  from  the  combination  of  equations  (3.65)  and  (3.66)  and  Lemma  3.10  that 


1  ^  1  +  tjtanatan^o  /tan^\ 

tan(^  —  ^o)  ^  fjtana  — tan^o  ^\tana/ 

1  +  xxo  ^  1  +  tjbxo  /x\ 


< 


X  ““  Xq 

1  +  962 
6-5" 


j^i  tjb-xo 
1  +  9  tan^  a 
tan  a  ■  5" 


(3.67) 


for  any  9  £  [—a, a]. 


□ 


The  following  two  lemmas  provide  preliminary  results  which  are  used  in  the  proof 
of  Theorem  3.14. 


Lemma  3.12  Suppose  that  n  >  2,  and  that  6  >  0  and  xq  are  real  numbers  with 
|xo|  <  6.  Then,  for  any  x. 


n 

X  +  36xo  —  (36  —  xxo)  •  ^ 

i=i 


tj  +  36xo 
36  ““  tjXo 


Uj{x) 


'  OL  ^ 

—  +  36x0 
.3:0  } 


Tnjx) 

TniSb/xoY 


(3.68) 
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Proof.  Let  Qix)  be  the  poly,  jmial  of  degree  n  defined  by  the  formula 

Q(x)  =  X  +  3bxo  —  (35  —  xxq)  ■  ^  +  35io  (3.69) 

"  36  -  tjXo 

It  follows  from  the  combination  of  (3.43)  and  (3.69)  that 
Q{tk)  =  tk  +  3bxo  -  (36  -  tkXo)  ■  51 

36  -  tjXo 

=  tk  +  3bxo  -  (36  -  tkXo)  •  =  0,  (3.70) 

60  —  tkXo 

for  k  =  1, . . .  ,n.  Clearly,  then,  Q{x)  satisfies  the  conditions 

Q{3b/xo)  =  36/xo  +  36xo 

Q{ti)  =  0 

i  (3.71) 

Q{tn)  =  0. 


It  is  clear  that  the  function 


'U 


+  36x( 


')'t„ 


Tn(x) 


(36/xo) 


(3.72) 


is  also  a  polynomial  of  degree  n  which  satisfies  the  n  +  1  conditions  (3.71).  Therefore, 


(36 

- 1-  36xo 

xo  > 


W 

Tn(3b/xo)' 


(3.73) 


and  (3.68)  follows  as  an  immediate  consequence  of  (3.69)  and  (3.73).  □ 


Lemma  3.13  Suppose  that  n  >  2,  and  that  6  >  0  and  xq  are  real  numbers  with 
|xo|  <  6.  Then, 


X  +  36xo  ^  tj  +  36xo 
36  —  xxo  ^  36  —  tjXo 


Uj{x) 


3(1+6^) 
6  •  5” 


for  any  x  €  [—1, 1]. 


(3.74) 
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Proof.  Dividing  (3.68)  by  (36  —  xxq)  and  taking  absolute  values,  we  obtaun 


X  +  36xo  ^  tj  +  36xo  ,  . 
-  2^  . 


1 


|36  —  xxol 


|36  — xxo  ^36  — tjXo 
In  addition,  due  to  Lemmas  3.5  and  3.6  we  have 

ir„(i)i  <  1 

for  any  x  €  [—1, 1],  and 


36  OL 
—  +  36xo 
xo 


|rn(a:)| 

|r„(36/xo)r 


36/ xq  d"  36xo 


r„(36/xo) 


<  36  ■  (1/xo  +  xo)  •  2 


3xo 


5-36 


<|  .  (1/6+1.) 


for  |xo|  <  6.  Substituting  (3.76)  and  (3.77)  into  (3.75),  we  obtain 


X  +  36xo  ^  tj  +  36xo  ,  , 

-  2]  ^r— n- •  “i(^) 


1 36  XXq 

for  any  x  €  [—1, 1]. 


,  1  66  3(1  +  6*) 


(3.75) 


(3.76) 


(3.77) 


(3.78) 


□ 


Theorem  3.14  Suppose  that  n  >  2,  and  that  a  and  Oo  o.re  real  numbers  with  0  < 
a  <  I  and  |0o|  <  a-  Then, 

1 

tan(^  —  ^o) 
for  any  6  such  that  3a  <  |0|  <  ^. 

Proof.  Let  6  be  any  real  number  such  that  3a  <  |0|  <  ^.  Then,  defining  the  real 
numbers  6,  x  and  xq  by  the  formulae  6  =  tan  a,  x  =  3  tan  a/  tan  9  and  xq  =  tan  we 
observe  that  |xo|  <  6,  and,  due  to  Lemma  3.8,  |x|  <  1.  We  also  observe  that 

1  1  +  tan  6  tan  Op  1  +  36xo/ x  _  x  +  36xo  r*?  sn'> 

tan(0  —  6o)  tan  6  —  tan  6o  36/ x  —  xq  36  —  xxq  ’ 


-E 


tj  +  3  tan  a  tan  00  /3tana 


3  tan  a  —  tj  tan  0 


}q  /3tana\ 

>0  \  tan  0  / 


< 


sin  2a  •  5” 


(3.79) 


^  tj  +  3tanatan0o  /3tan  a\  _  ^  tj  +  36xo  ,  . 

^  3  tan  a  —  tj  tan  0o  \  tan  0  /  ^  36  —  tjXo  ^ 


(3.81) 


and 
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It  follows  from  the  combination  of  equations  (3.80)  and  (3.80)  and  Lemma  3.12  that 

1  ^  tj  +  3  tan  a  tan  6q  /3  tan  a \ 

tan(d  —  ^o)  ^  3tan  a  —  tan  ^0  \  tand  / 

\x 


<  3- 

for  any  9  such  that  3a  <  |^|  <  ^.  □ 

The  following  three  theorems  provide  formulae  for  translating  along  the  inter¬ 
val  [— f,f]  Chebyshev  expansions  of  the  type  described  in  the  previous  two  the¬ 
orems.  Theorem  3.15  provides  a  formula  for  translating  expansions  described  in 
Theorems  3.11,  Theorem  3.16  describes  a  mechanism  of  converting  the  expansion  of 
Theorem  3.14  to  the  expansion  of  Theorem  3.11,  and  Theorem  3.17  provides  a  way 
of  translating  the  expansion  of  Theorem  3.14. 


-I-  36xo  ^  tj  -t-  36xo  ,  . 

-22ol  .  -  • 


^  XXq  j=l  tjXQ 

1  +  3  sec*  a 


b  •  5’*  tan  a  •  5”  sin  2a  •  5” 


(3.82) 

(3.83) 


Theorem  3.15  Suppose  that  n^N  >  2,  and  1st  a,c,d  be  real  numbers  such  that 
0  <  a  <  7r/8  and  [c  —  d,  c-f  </|  C  [—a,  a].  Let  the  function  f  :  [— f,f]  —*  C  be  defined 
by  the  formula 


s 

m  =  E 


Un{d-0k) 


(3.84) 


where  3a  <  |dfc|  <  ^  for  k  =  1, . . . ,  TV,  and  oi, . . . ,  ayv  is  a  set  of  complex  numbers. 
Further,  let  ...  ,'^n  be  a  set  of  complex  numbers  defined  by  the  formula 


= /(arctan(tfc  tan  a))  (3.85) 

fork  =  1, . . . ,  n,  and  let  . . .  ,^n  be  a  set  of  complex  numbers  defined  by  the  formula 

^lan(c  -f  arctan(tfc  tan  d))  \ 


j=i 


tan  a 


(3.86) 


for  k  =  \, . . .  ,n.  Then,  for  any  9  £  (c  —  d, c  -f  d], 


<  A 


(n  -I- 1)(1  -I-  9  tan*  a) 
tan  a  •  5” 


(3.87) 


where  A  =  l“*l- 
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Proof.  It  follows  from  the  triangle  inequadity  that 


-  S  f 

*=1  \ 


tan(^  —  c) ' 
tan^ 


<  5i  +  Sj 


(3.88) 


where 


Si  =  fiO)  -  f(c  -f  arctan(tfc  tan  d))  •  Uj  j  ’  (3.89) 


S2=  +  ^ctan(tit  tan  <£))- 'i'jt)  •  Uj  • 


(3.90) 


Combining  Theorem  3.11  with  Lemma  3.7  and  the  triangle  inequality,  we  have 


Si<  A- 


1  +  9  tan^  a 
tan  0-5" 


(3.91) 


52  <  ^  /(c  + arctan(tfctand)) 

Jk=l 

^  1 +  9tan^  a 


j=i  \ 


tain(c  +  ajctan(tt  tan  d)) ' 
tana 


<  An 


tain  a  •  5"  ’ 


(3.92) 


where  A  =  \<^k\-  Finally,  substituting  (3.91)  amd  (3.92)  into  (3.88)  we  obtain 


1  +  9tam^  a 
tan  a  ■  5" 


(3.93) 


for  any  9  ^  [c  —  d,c  +  <I\. 


Theorem  3.16  Suppose  that  n,N  >  2.  and  let  a,c,d  be  real  numbers  such  that 
0  <  a  <  ir/8  and  |c|  —  d  >  3a.  Let  the  function  f  :  [— |,  »  C  6e  defined  by  the 

formula 

fW  =  E  .■  tT  r,  <3.94) 

tan(0  -  6k) 

where  6k  €  [—a, a]  for  k  =  l,...,iV,  and  ai,...,QN  is  a  set  of  complex  numbers. 
Further,  /et  $i,. .  be  a  set  of  complex  numbers  defined  by  the  formula 


$jt  =  /(arctan(3tan(a)/tfc)) 


(3.95) 
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for  k  =  1, . . .  ,n,  and  let  . . .  ,'^n  be  a  set  of  complex  numbers  defined  by  the  formula 

3  tan  a 


tan(c  +  £irctan(ti  tan «/)) 
for  k  =  1,. . .  ,n.  Then,  for  any  €  [c  —  d,c  +  d\, 


) 


(3.96) 


<A 


where  A  = 

Proof.  It  follows  from  the  triangle  inequality  that 


3n  sec^  a  +  1  4-  9  tan*  a 


tan  a  •  5" 


(3.97) 


<  +  ^2 


where 


and 


5i  = 


f{6)  —  ^  /(c  +  arctan(tfc  tand))  •  uj 


S2  = 


5^(/(c  +  arctan(tjt  tand))  -  ^fc)  •  u_, 


/t=i 


Combining  Theorem  3.11  with  the  triangle  inequality  gives  us 

1  +  9  tan*  a 


Si<A- 


tan  0-5'*  ’ 


(3.98) 

(3.99) 

(3.100) 

(3.101) 


and  from  the  combination  of  Theorem  3.14,  Lemma  3.7  and  the  triangle  inequality, 
we  have 


52  <  £ 


*=1 

<  An  • 

>7V 


f{c  +  arctan(tfc  tan  d))  — 

j=i 

3  sec*  a 
tan  a  •  5" ’ 


3  tan  a 


^tan(c  +  arctan(tfc  tand))^ 


(3.102) 


where  A  =  I22=i  Finally,  substituting  (3.101)  and  (3.102)  into  (3.98)  we  obtain 


for  any  ^  €  [c  —  d,  c  +  d]. 


<  A- 


3n  sec*  c  +  1  +  9  tan*  a 
tan  a  •  5” 


(3.103) 
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Theorem  3.17  Suppose  that  n,N  >  2,  and  let  a,c,d  be  real  numbers  such  that 
0  <  a  <  7r/8  and  [c  —  d,c  +  d\  D  [—a, a].  Let  the  function  f  :  |]  — >  C  6e  defined 

by  the  formula 


N 

m  =  E 


Olk 


(3.104) 


where  6k  €  [—a, a]  for  k  =  and  Oi,...,qjv  is  a  set  of  complex  numbers. 

Further,  let  ...  be  a  set  of  complex  numbers  defined  by  the  formula 


^k  =  /(arctaLn(3tan(a)/4))  (3.105) 

for  fc  =  1, . . . ,  n,  and  /ci  $i, . . . ,  be  a  set  of  complex  numbers  defined  by  the  formula 

3  tan  a 


i=i 


i) 


^tan(c  +  arctan(3tan((i)/tjt)), 
for  fc  =  1, . . , ,  n.  Then,  for  any  6  such  that  —  c|  >  3d, 

3(n  +  1)  sec^  a 


<  A 


h-S‘‘ 

where  A  =  |afc|- 

Proof.  It  follows  from  the  triangle  inequality  that 


t,  ‘Wme-c)^ 


tail  a  ■  5” 


^  •S'l  -f-  1S2 


where 


5i  = 


/(^)  ~  ^  /(^  arctan(3tan(<f)/4)) 
*=1 


/  3  tan  d  N 
^  \tan(^  —  c)) 


and 


5,= 


^  (/(c  +  arctan(3tan(d)/4))  -  4>it)  •  u_,  [ 

t=i  V 


3  tan  d 
\tan(0  —  c)^ 


(3.106) 


(3.107) 


(3.108) 


(3.109) 


(3.110) 


Combining  Theorem  3.14  with  Lemma  3.7  and  the  triangle  inequality,  we  have 

3  sec^  a 


Si  <  A 


tan  a  •  5” ’ 


(3.111) 
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52  <  53  /(c  +  axctan(3tan(</)/tfc)) 

k=l 

3  sec^  a 


i=i  V 


3  tan  a 

tan(c  +  axctan(3tan((/)/tit))^ 


<  An 


tan  a  •  5" 


(3.112) 


where  A  =  lat].  Finally,  substituting  (3.111)  aud  (3.112)  into  (3.108)  we  obtain 

m -2^)  <  x ■  (3.113) 

^  ^  \tan(d  —  c)  J  tan  a  •  5" 

for  any  6  such  that  \6  —  c\  >  3d.  □ 


3.2  Application  of  the  FMM  to  Nonequispaced 
FFTs 

This  section  consists  of  four  parts.  In  Subsection  3.2.1  we  describe  briefly  how  the  one 
dimensional  Fast  Multipole  algorithm  of  Chapter  2  can  be  applied  to  the  problems  of 
this  chapter,  in  Subsection  3.2.2  we  outline  a  set  of  four  algorithms  for  these  problems, 
Subsection  3.2.3  contains  more  formal  descriptions  of  these  algorithms,  and  finally  in 
Subsection  3.2.4  we  discuss  a  generalization  of  Problems  3. 1-3.4. 


3.2.1  FMM  and  Trigonometric  Interpolation 

There  exist  a  number  of  different  formulations  of  the  trigonometric  interpolation 
problem  (see  [23]).  The  version  we  will  use  for  the  purposes  of  this  chapter  is  described 
«is  follows:  given  a  set  of  points  {yi, . . . ,  and  function  values  {/i, . . . ,  /n},  evaluate 
the  interpolating  Fourier  series  at  the  points  {xi, . . . ,  xyv}-  According  to  Theorem  3.3, 
these  values  are  given  by  the  formulae 


N  / 

=  c/  •  £  /j  •  dj  ■  I 

j=i  V 


tan((x/  -  yj)/2) 


(3.114) 


for  /  =  1, . . . ,  A^,  where  {c/}  and  {dj]  are  defined  by  the  formulae 
c/  =  n  sin((x/  —  yit)/2)  = 


(3.115) 
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for  /  =  1, . . . ,  iV,  and 


Pi  -  yk)/2) 


(3.116) 


for  j  =  1 , . . . ,  AT. 


Remark  3.3  The  FMM  algorithms  of  Chapter  2  axe  designed  to  evaluate  expressions 
of  the  form 


h^^-^k 


(3.117) 


in  O  (iVlog  arithmetic  operations,  where  e  is  the  desired  accuracy.  With  a  few 
minor  modifications  they  can  also  be  used  to  evaluate  expressions  of  the  form 


A  Qk 
tan(x  -  Xk) 


(3.118) 


ln(sin(r  -  xfc)), 

fc=i 


(3.119) 


and  hence  expressions  of  the  form  (3.114)  for  the  same  computational  cost.  Moreover, 
the  algorithmic  procedure  for  the  kernel  1/ tan  x  is  virtually  identical  to  that  for  1/x, 
and  the  various  expansions  required  by  the  algorithms  are  manipulated  via  Theo¬ 
rems  3.15,  3.16  and  3.17  (see  Chapter  2  for  detailed  descriptions  of  these  algorithms). 

3.2.2  Informal  Descriptions  of  the  Algorithms 

In  this  subsection  we  outline  how  a  fast  trigonometric  interpolation  scheme  can  be 
used  to  construct  efficient  algorithms  for  Problems  3. 1-3.4  of  this  chapter. 

We  begin  with  some  notation. 

T  :  will  denote  the  matrix  which  maps  a  sequence  of  N  complex 

numbers  to  its  discrete  Fourier  transform.  T  is  defined  by  the  formulae 


_  g2x,  0-N/2-l)  (fc-N/2-l)/N 

for  j,  A:  =  1, . . . ,  A^,  and  it  is  well  known  that  =  T ^  and  that 


(3.120) 


Remark  3.4  T  and  T  ^  can  each  be  applied  in  0{N  log  N)  operations  via  the  FFT. 
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V  :  will  denote  the  matrix  which  maps  the  values  of  aii  A^-term  Fourier 

series  at  N  equispaced  points  {yi,...,y/v}  on  [— tt, tt]  to  the  values  of  this  series  at 
the  arbitrarily  spax;ed  points  {ij, . . .  ,ia'}.  According  to  Theorem  3.4,  V  is  defined 
by  the  formulae 


(3.121) 


(Nx,\ 

(-1)*  ( 

\  2  ) 

N  [ 

for  j,  A:  =  1, . . . ,  A^.  It  follows  directly  from  (3.121 )  that 


'pT  _ 


y  .  fNxk\  ( 


tan((xt  -  j/_,)/2) 


(3.122) 


for  j,k  =  1, . . . ,  A/.  The  inverse  of  the  mapping  V  converts  the  values  of  an  A^-term 
Fourier  series  at  the  points  {xi,...,x^}  to  the  values  of  this  series  at  the  equispaced 
points  {yi, . . . ,  y^r}.  V~^  is  therefore  given  analytically,  and  according  to  Theorem  3.4 
it  is  defined  by  the  formulae 


^)k  —  Cj  •  dk  • 


tan((yj  ~  Xfc)/2) 


(3.123) 


lox  j.k  =  1,...,A^,  where  ci,...,ca  and  d\ are  sequences  of  real  numbers 
defined  by  the  formulae  (3.115)  and  (3.116).  It  follows  directly  from  (3.123)  that 


=d,c. 


tan((y*  -  Xj)j2) 


(3.124) 


for  j,k  =  1, . . . ,  A^. 


Remark  3,5  V,  V^,  V~^  and  are  all  of  the  same  form,  and  each  can  be 

applied  with  a  relative  precision  e  in  O^A^log^^^  operations  via  the  FMM  (see 
Section  3.2.1). 

Observation  3.6  From  the  combination  of  (3.1),  Theorems  3.3  and  3.i,  and  several 
elementary  matrix  identities,  we  see  that 

F  = 

FT  =  jr.pT 

(3.125) 

(F^)-'  =  (F^)-’-F-'. 


Furthermore,  due  to  Remarks  3.4  und  3.5,  F ,  F^ ,  F  '  and  {F^)  ^  can  each  be  applied 
in  0  ^A^log  arithmetic  operations. 
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3.2.3  Formal  Descriptions  of  the  Algorithms 

Following  are  detailed  descriptions  of  the  four  algorithms  of  this  chapter. 

Algorithm  3.1 

Step  Complexity  Description 

1  0{N log  N)  Comment  [Evaluate  Fourier  series  at  equispaced  points  using  FFT.] 

Compute  Qj  =  Efc^i'Jv/2  "fc  ' 

2  0(iVlog(^))  Comment  [Interpolate  in  space  domain.] 

do  j  =  I,  N 

9]  =  9j  -i-iy/N 

end  do 

Compute  fi  =  Qj/  tan((x/  -  yj)/2)  for  /  =  1, . . .,  iV  using  FMM. 
do  /  =  l,iV 

end  do 

do  /  =  l,iV 

ft  =  ft-  sin(iVx//2) 

end  do 

Total  0(iV  •  log  iV  + TV  •log(i)) 

Algorithm  3.2 

Step  Complexity  Description 

1  0(Alog(M)  Comment  [Interpolate  in  frequency  domain.) 

do  j  =  l,iV 

ttj  =  Oj  •  sin(iVij/2) 

end  do 

Compute  G;  =  -  Ejli  tan((y;  -  Xj)/2)  for  /  =  1, . .  .,TV  using  FMM. 
do  /  =  1,TV 

Etv 

J=1 

end  do 

do  I  =  1,N 

ai  =  ai-i-iy/N 

end  do 

2  0(iV  log  TV)  Comment  [Evaluate  Fourier  series  at  equispaced  points  using  FFT.) 

Compute  fj  =  T,k=SN/2  “'t  ■  for  j  =  1, . . . ,  TV. 


Total  0{N  •  log  TV  +  TV  •  log(i)) 
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Algorithm  3.3 

Step  Complexity  Description 

1  C)(iV log(i))  Comment  [Interpolate  in  space  domain.] 

do  j  =  1,N 

h  =  h -dj 

end  do 

Compute  ai  =  fj! tan((y/  -  a:j)/2)  for  /  =  1, . . jV  using  FMM. 
do  /=  l,iV 

at  =  ai-i-  fj 
end  do 
do  /=  l,iV 

at  —  at  •  Cl 

end  do 

2  O(iVlogiV)  Comment  [Obtain  Fourier  coefRents  using  FFT.j 

Compute  aj  =  jj  ■  Yik=\  for  j  =  ~N/2, ...,  N/2  -  1. 

Total  0(N  -logN  +  N  -  logiD) 

Algorithm  3.4 

Step  Complexity  Description 

1  0(N  log  N)  Comment  [Obtain  Fourier  coeffients  using  FFT.j 

Compute  aj  =  jj  ■  Ylk=i  h  ■  for  j  =  1, . . . ,  iV. 

2  0(./Vlog(i))  Comment  [Interpolate  in  frequency  domain.] 

do  j  =  1,A^ 

end  do 

Compute  at  —  —  Oj/  tan((x/  -  yj)/2)  for  /  =  1, . . .,  using  FMM. 
do  /  =  l,N 

ai  =  at-i-  53^1  Oj 
end  do 
do  /=  l,iV 
at  =  at  •  di 
end  do 

Total  0(iV  •  log  TV  +  •  log(^)) 


3.2.4  FFTs  for  Complex  Data  Points 

Various  generalizations  of  the  problems  addressed  in  this  chapter  are  mentioned  briefly 
in  Section  6.  One  of  the  generalizations  of  Problems  3. 1-3.4  merits  special  attention, 
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and  is  discussed  in  this  section:  this  is  the  case  when  the  points  {ij}  are  complex, 
cind  lie  slightly  off  the  real  axis. 

We  are  interested  here  in  the  transformations  described  by  the  formulae 

N/2-1 

f,  =  (3.126) 

k=-N/2 

for  j  =  1, . . . ,  iV,  which  is  a  generalization  of  (3.1)  with 

Xj  =  rj  +  isj.  (3.127) 

Algorithms  3. 1-3.4  can  be  modified  to  evaluate  expressions  of  the  form  (3.126),  pro¬ 
vided  that  the  Sj  are  small  (on  the  order  of  -L). 

Problems  of  this  type  are  frequently  encountered  in  signal  analysis,  computational 
complex  analysis  and  several  other  areas. 


Chapter  4 

Nonequispaced  FFTs:  An 
Alternative  Approach 


In  this  chapter,  we  present  another  set  of  algorithms  for  evaluating  series  of  the  form 
(1.1).  This  approach  uses  an  interpolation  scheme  based  on  the  Fourier  analysis  of 
Gaussian  bells  in  place  of  the  FMM-based  interpolation  scheme  of  Chapter  3.  The 
two  approaches  are  compared  in  Chapter  6. 

For  the  remainder  of  the  chapter  we  will  operate  under  the  following  assumptions: 

1.  a;  =  {wo, . . . ,  Wat}  and  x  —  {a:o>  •  •  • ,  a:// }  are  finite  sequences  of  real  numbers. 

2.  uk  €  [- A/2,  A/2]  for  Jt  =  0, . . . ,  A. 

3.  Xj  €  [— TT,  tt]  for  j  =  0, . . . ,  A. 

4.  a  =  {oo,. .  ■  ,aAr},  /  =  {f~N/2,  ■  ■  ■  ■>  fN/2}^  •  ■■■,^N/2},  9  =  {po,  ■  •  ■,9n}, 

7  =  {70,  •  •  •  ?  7v}  and  h  =  {ho, . . . ,  hi\f}  are  finite  sequences  of  complex  numbers. 

We  will  consider  the  problems  of  applying  the  Fourier  matrix  and  its  transpose,  i.e. 
we  are  interested  in  the  transformations  F,  G  :  defined  by  the  formulae 

A  =  f  («)j  =  E  (4.1) 

k=0 

for  j  =  —  A72, . . . ,  A/2,  and 

N/2 

»  =  (?(%=  E  (4.2) 

k=-N/2 


for  j  =  0, . . . ,  A. 
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Remark  4.1  If  Xk  =  — Wfc  •  2ir/iV  for  fc  =  0,..  .,iV,  then  (4.1)  can  be  rewritten  as 

f,  =  '£a,e-=’‘,  (4.3) 

k=0 


or  alternatively  as  F  =  G’. 

We  will  also  consider  the  more  general  transformation  H  :  —*  defined 

by  the  formula 

=  (4.4) 

k=0 

More  formally,  we  consider  the  following  problems 


•  Problem  4.1:  Given  a,  find  /  =  F{a). 

•  Problem  4.2:  Given  /3,  find  g  =  G{0). 

•  Problem  4.3:  Given  7,  find  h  =  H{-y). 

The  plan  of  this  chapter  is  as  follows.  We  start  in  Section  4.1  with  some  results 

from  analysis  and  approximation  theory  which  are  used  in  the  design  of  the  algo¬ 
rithms.  This  is  followed  by  an  informal  description  of  the  algorithms  in  Section  4.2. 
In  Section  4.3  we  introduce  some  notation  which  is  used  in  a  set  of  more  detailed 
algorithm  descriptions  in  Section  4.4. 


4.1  Mathematical  and  Numerical  Preliminaries 


4.1.1  Elementary  Analytical  Tools 

In  this  subsection  we  summarize  some  well-known  results  to  be  used  in  the  remainder 
of  this  chapter.  Lemmas  4.1  and  4.2  are  obvious,  and  Lemmas  4.3  and  4.4  can  be 
found,  for  example,  in  [14]. 


Lemma  4.1  For  any  real  c, 


2 

-  sin(c7r). 


Lemma  4.2  For  any  integer  k, 


e'^^dx  = 


1 


1 

0 


if  k  =  0 
otherwise 


(4.5) 


(4.6) 


4.1.2  Relevant  Facts  from  Approximation  Theory 

The  principal  tool  of  this  chapter  is  a  somewhat  detailed  analysis  of  Fourier  series  of 
functions  (f> :  [— tt,  it]  —*  C  given  by  the  formula 

<(>ix)  =  •  e'“  (4.10) 

where  b  >  ~  and  c  are  real  numbers.  We  present  this  analysis  in  the  lemmas  and 
theorems  of  this  subsection,  numbered  4.5-4.10. 

Lemmas  4.5  and  4.6  provide  two  inequailities  which  axe  used  in  Theorem  4.7, 
and  Theorems  4. 7-4.9  are  intermediate  results  leading  to  Theorem  4.10.  This  final 
theorem  explains  how  to  approximate  functions  of  the  form  using  a  small  number 
of  terms,  and  the  algorithms  of  this  chapter  are  beised  upon  this  result.  We  derive  error 
bounds  for  all  approximations  which  allow  us  to  perform  numerical  computations  to 
any  specified  accuracy. 

Lemma  4.5  For  any  real  b>  ^,c  and  any  integer  k, 

|2  J°°  cos((c  -  k)x)dx  +  e'^’^'^^^dx  <  2we''^^  '  (l  +  ^)  •  (4.11) 

Proof.  Using  the  triangle  inequality  and  Lemma  4.4  we  have 

|2  r  •  cos((c  -  k)x)dx  + 

I  J'K  J  —  TT 

<  2  +  2xe-*"'  <  27re-^''  •  +  1^  (4.12) 
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Lemma  4.6  For  any  real  b>  and  any  integer  k, 


^2  J  e  ***  cos((c  —  k)x)dx  +  e  ^ 

Proof.  Integrating  by  parts  we  have 

2  f  •  cos((c  —  k)x)dx 

Jir 


r  e'^'^-'^^^dx 

J 


(l  +  1)  .  (4.13) 


(c  -  ky 


=  — e  sin((c  —  A:)i)l  H - r  /  xe"**^*  sin((c  —  k)x)dx  (4.14) 

c  —  k  ^  c  —  k  Jtt 


c  —  k 


e  sin 


46  /■' 

mi(c  -  k)ir)  + 


xe  sin((c  —  fc)x)<fx. 


After  rearranging  the  terms  in  (4.14)  and  integrating  by  parts  again  we  obtain 
/•<»  2 

2  1  e  “  cos((c  —  k)x)dx  H - —  sin((c  —  k)ir) 

Jit  c  —  k 

4b  r°°  I  2 

=  - -  I  xe~  *  sin((c  —  k)x)dx 

C  ““  K  J  If 

=  ^^ky  cos((c  —  fc)x)|  —  j  (1  —  26x^)e“*’**  cos((c  —  fc)x)c(3t^5) 


< 

46 

fe-*-’ 

(c  -  ky 

V 

Jfr 

< 

46 

(c  -  ky 

V 

/, 

Finally,  due  to  (4.15)  and  Lemmeis  4.1  and  4.4  we  have 

|2  r  cos((c  -  fc)x)di  +  •  T  < 

I  J^jr  J—TT 


dx 


4be-^^^  /_  2 


[c-ky 

86x6-^^' 

{c-ky 


(l  4)4.16) 


The  following  theorem  provides  an  explicit  expression  for  the  coefficients  of  a 
Fourier  series  which  approximates  functions  of  the  form  (4.10). 

Theorem  4.7  Let  4>{x)  =  for  any  real  b>  |,c.  Then,  for  any  x  €  (— 7r,7r), 


(4.17) 
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where 


-(c-k)^/4b 


(4.18) 


for  k  =  — oo, . . . ,  oo. 

Proof.  We  denote  by  Ck  the  fc-th  Fourier  coefficient  for  <f>,  so  that  for  x  €  (— ir,ir), 


4.{x)=  Y. 

jb=— OO 

and  due  to  Lemma  4.3  and  equation  (4.18)  we  have 

1  r  , .  .  . 


(4.19) 


27r  \*/— oo  0 


^  (\/f  ■  +  f  e“'’^'-“+‘^dz  -  e-‘*'+*^-‘*^dz)  (4.20) 

=  Pk~  —  f  cos((c  —  fc)i)dz. 


Rearranging  equation  (4.20)  and  applying  Lemmas  4.5  and  4.6  we  obtain  the  inequal¬ 
ities 


<^k  —  Pk  — 

<^k  —  Pk  — 


-67r2 

i“7r 

c 

/  e‘“e-’'=*dz 

27r  . 

l-r 

fir 

c 

/  e'“e-*^dz 

27r  . 

'  — TT 

4be-^ 
(c  -  ky 


04). 

4'4). 


(4.21) 

(4.22) 


and  it  now  follows  from  the  combination  of  (4.19),  (4.21)  and  (4.22)  that,  for  any 

X  €  (-7r,7r), 


E  -  e-'"' ■  e*' 

fc=— oo 

—  ^  ( rr  n  ^ 

—  ^  •  1  CTfc  —  Pfc - ; 


fc,|c— <:|>3 


„  4ie-*'’  /,  1  \  _ 


,  Q  °°  1 

<  4i«-‘'  .-.2.2:-+6= 


k,\c~k]<3 

10 


04) 


(4.23) 
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Some  elementary  an2iysis  yields 


V—  ^  f°°  ^  -  ^ 


(4.24) 


and  substituting  (4.24)  into  (4.23)  we  have 

-  E  •  e‘“  <  e-'"*  •  (46  +  .  (4.25) 

ik— —00 

To  complete  the  proof  we  make  use  of  the  triangle  inequality  and  (4.25)  to  obtain 
4>{x)  -  E  PJte***  <  <t,{x)  -  f;  Pfce'"'"  -  •  e‘“| 

k=—QO  k=—oo 

<  e-'^  ■  (46  +  .  (4.26) 

□ 


Remark  4.2  According  to  Theorem  4.7,  functions  of  the  form  can  be  ap¬ 

proximated  by  a  Fourier  series  whose  coefficients  are  given  analyticaJly,  and  the  error 
of  the  approximation  decreases  exponentially  as  b  increases. 

Remark  4.3  The  coefficients  pk  as  defined  by  (4.18)  have  a  peak  at  k  —  [c],  the 
nearest  integer  to  c,  and  decay  exponenti2JIy  as  ifc  —>  ±00.  We  keep  only  the  q  1 
largest  coefficients,  where  the  integer  q  is  chosen  such  that 

q  >  4bir,  (4.27) 


so  as  to  satisfy  the  inequality 

g-(9/2)V4J> 


(4.28) 


The  following  theorem  estimates  the  truncation  error  under  the  conditions  of 
Remark  4.3  and  thus  provides  a  way  of  approximating  functions  of  the  form  (4.10) 
by  a  (^  -h  l)-term  series. 

Theorem  4.8  Let  <f>{x)  =  for  any  real  b  >  ic,  and  let  q  be  an  even  integer 

such  that  q  >  Abr.  Then,  for  any  x  €  (— tt,  tt), 

Id+9/2 

E  -(46-^9),  (4.29) 

*=[c)-9/2 

where  are  defined  by  (4- 18). 
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Proof.  For  any  x  €  (— 7r,7r), 


lc]+,/2 

*=[cl-«/2 


< 

+ 

E 

+ 

E 

k^—oo 

t>[c]+9/2 

*<(c]-9/2 

Due  to  (4.18)  and  the  triangle  inequality  we  have  the  inequalities 

oo  --(c-k)^/Ab  OO  --k^/4b 

E  <  E  ^ <  E  ^-7=-> 

fc>[c]+9/2  k=lc]+g/2+l  2\/6ir 

I  I  (c1~9/2— 1  (c-t)^/4fc  oo  .—k^/4b 


E  <  E 

fc<[cl-9/2  fc=-c 


2\/fir 


Some  elementary  analysis  and  an  application  of  Lemma  4.4  yields 


and  it  follows  from  the  combination  of  (4.27),  (4.28)  and  (4.33)  that 


(-^)- 


(4.30) 


(4.31) 

(4.32) 


V  +  r  .(2  +  2^)  ,  (4.33) 

fc=a/2  V  29/2/ 


(4.34) 


Substituting  (4.34)  into  (4.31)  and  (4.32)  we  have 

E  <  2^  .  (1  +  i) 

t>M+?/2  it<rrcl-o/2  \  7r/ 


<  e-*-’ .  22 
9’ 


and  finally,  substituting  (4.26)  and  (4.35)  into  (4.30),  we  obtain 
1  W+9/2  I  _  ,  70  in\ 


•  X  1  ri  V 

fli(x)  -  E  •  (4^  +  -^  + 

fc=[c]-,/2  ^  y  y  / 


(4.35) 


(4.36) 

□ 


The  following  corollary  describes  a  formula  for  approximating  e’“  using  a  series 
of  q  +  1  terms. 


70  CHAPTER  4.  NONEQUISPACED  EFTS:  AN  ALTERNATIVE  APPROACH 


Corollary  4.9  Suppose  that  m  >  2  is  an  integer  and  that  the  conditions  of  Theo¬ 
rem  4-8  are  satisfied.  Then,  multiplying  both  sides  of  (4-29)  by  we  obtain 

<  e-*'' -(46  +  9) 

<  •  e-**"'  •  (46  +  9)  (4.37) 


—  e 


bx^ 


[c]+g/2 
k=[c]~q/2 


ikx 


for  anyxei-^,^]. 


Finally,  Theorem  4.10  malces  use  of  a  simple  linear  scaling  to  generalize  the  in¬ 
equality  (4.37)  from  [— ^,^]  to  any  interval  [—d,d\.  This  is  the  principal  result  of 
the  section. 


Theorem  4.10  Let  b  >  ^,c,d  >  0  be  real  numbers,  and  let  m  >  2,q  '>  46x  be 
integers.  Then,  for  any  x  €  [—d,d\. 


^icx  _  g6(iir/m<i)2 


[cmd/it]+ql2 
/c=(cmd/w]— 9/2 


<  •  (46 -H  9)  (4.38) 


where  {p*}  are  defined  by  (4-18). 


Remark  4.4  The  error  bounds  obtained  in  the  above  theorems  are  rather  pes¬ 
simistic.  Numerical  estimates  for  the  actual  errors  can  be  found  in  Section  4.5. 


4.2  Informal  Descriptions  of  the  Algorithms 

In  this  section  we  give  informal  outlines  of  algorithms  for  Problems  4. 1-4.3  of  this 
chapter.  More  formal  descriptions  of  these  algorithms  are  presented  in  Section  4.4. 
The  algorithms  for  these  problems  are  based  on  the  following  principal  observation. 

Observation  4.5  According  to  Theorem  4-10,  any  function  of  the  form  e*“  can  be 
accurately  represented  on  any  finite  interval  on  the  real  line  using  a  small  number  of 
terms  of  the  form  and  this  number  of  terms,  q,  is  independent  of  the  value 

of  c. 

The  FFT  algorithm  applies  the  Fourier  matrix  to  arbitrary  complex  vectors  in 
0{N  log  N)  operations  when  {uk}  are  integers  and  {zj}  are  equally  spaced  in  [— tt,  tt]. 
For  the  eflBcient  application  of  the  transformations  described  by  (4.1),  (4.2)  and  (4.4), 
we  relate  these  more  general  cases  to  the  equispaced  case  of  the  FFT.  Observation  4.5 
is  used  in  two  ways  to  achieve  this: 
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•  To  approximate  each  e******  in  terms  of  a  q-term  Fourier  series. 

•  To  approximate  the  value  of  a  Fourier  series  at  each  Zj  in  terms  of  vailues  at  the 
nearest  q  equispaced  nodes. 

This  interpolation  between  equispaced  and  nonequispaced  sets  of  points  can  thus 
be  performed  in  0{Nq)  operations. 

Observation  4.6  The  overall  complexity  of  each  such  algorithm  which  couples  the 
FFT  with  the  interpolation  scheme  will  be  0{N  log  N  +  Nq)  operations. 


4.3  Notation 

In  this  section  we  introduce  the  notation  to  be  used  in  the  next  section  for  the  detailed 
algorithm  descriptions. 

For  an  integer  m  >  2  and  a  real  number  fe  >  0,  we  will  define  a  real  number  £  >  0 
by 

£  =  +  (4.39) 

and  we  will  denote  by  q  the  smallest  even  natural  number  such  that 


q  >  46;r. 


(4.40) 


For  an  integer  m  and  a  set  of  real  numbers  we  will  denote  by  pk  the  nearest 
integer  to  muk  for  A:  =  0, . . . ,  A^,  and  by  {Pjk}  a  set  of  real  numbers  defined  by  the 
formula 

Pjk  =  — ^  (4.41) 

2\/bir 

for  k  =  0,...,N  and  j  =  — q/2, . . . ,  q/2. 

Observation  4.7  Setting  d  =  ir  in  Theorem  4-10  we  see  that 


for  any  k  =  0, . . .  ,N  and  any  x  €  [— where  e  is  defined  by  (4-39). 


(4.42) 


For  a  given  set  of  complex  numbers  {at},  we  will  denote  by  {tj}  the  unique  set 
of  complex  coefficients  such  that 


mN/2—l 


k=l  j=-g/2 


}=-mN/2 


(4.43) 
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so  that 

j,k,Hk+j=l 

We  will  denote  by  {Tj}  a  set  of  complex  numbers  defined  by  the  formula 

mN/2-l 

T]  =  £  T-fc  •  (4.45) 

k=-mN/2 

for  j  =  —mNI2, . . . ,  miV/2  —  1. 

Further,  we  will  denote  by  {/,}  another  set  of  complex  numbers  defined  by  the 
formula 

I  =  .  jr  (4.46) 

f.ri  =  -;V/2,...,iV/2. 

Observation  4.8  Combining  (4-4^)  ~  (4-46)  ^ith  the  triangle  inequality,  we  see  that 

l/j  -  Al  <  £  i;  (4.47) 

fc=0 

forj  =  —N/2,. . .  ,N/2,  where  {/,  =  F(Q!)j}  are  defined  by  (4-1)- 

For  an  integer  m  and  a  set  of  real  numbers  {xj}  we  will  denote  by  Vj  the  nearest 
integer  to  XjmN/2Tr  for  j  =  0, . . . ,  N,  and  by  {Qjk}  a  set  of  real  numbers  defined  by 
the  formula 

Qjk  =  •  e“<*-'’”^/27r-(..,+fc))2/46  (4_4g^ 

2y/bK 

for  j  =  0, . . . ,  iV  and  k  =  —9/2, . . .  ,9/2. 

Observation  4.9  Setting  d  =  N/2  in  Theorem  4-10  we  see  that 

<1/2 

e**®j  ..  ^b(2-irk/mNf  ^  ^i(uj+l)2Tk/mN 

l=-q/2 

for  any  j  =  0,. . .  ,N  and  any  k  G  [—N/2,  N/2],  where  e  is  defined  by  (4-39). 

For  a  given  set  of  complex  numbers  {/?*},  we  will  denote  by  {ufc}  a  set  of  complex 
numbers  defined  by  the  formula 


<  £  (4.49) 


Uk  =  0k- 


(4.50) 
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for  k  =  —  A^/2, . . . ,  N/2,  and  by  {Ui}  a  set  of  complex  numbers  defined  by  the  formula 


^1=  ‘^k- 


2‘iftkllmN 


(4.51) 


k=-N/2 


for  /  =  —mN/2, ... ,  mA^/2  —  1. 

Further,  we  will  denote  by  another  set  of  complex  numbers  defined  by  the 
formula 

9/2 

9:=  E  (4-52) 

'=-9/2 

for  j  =  0, . . . ,  N. 

Observation  4.10  Combining  {4.4^)  ~  (4-52)  with  the  triangle  inequality,  we  see 
that 


\9:  <  £  •  E  1^*1 

AsO 

Jor  j  =  0, . . . ,  N ,  where  {gj  =  G{$)j}  are  defined  by  (4-2). 


(4.53) 


For  a  set  of  real  numbers  {ij}  we  will  denote  by  rjj  the  nearest  integer  to  XjN/2i: 
for  j  =  0, . . . ,  N,  and  by  {Rjk}  a  set  of  real  numbers  defined  by  the  formula 


R  ,  —  .  ^-(^jAf/2’r-(9;+A:))*/44 

for  j  =  Q,. . .  ,N  and  k  =  —q/2, . . .  ,9/2. 

Observation  4.11  Setting  d  =  N/2  in  Theorem  4-10  we  see  that 


(4.54) 


^ikxj/m  _  ^b{2-rk/mNf  .  ^  .  ^t{Vj+l)^irk/mN  ^  ^ 

l=-ql2 

for  any  j  =  0, . . . ,  N  and  any  k  €  [—N/2,  N/2]  where  e  is  defined  by  (4.39). 


(4.55) 


For  a  given  set  of  complex  numbers  {7*},  we  will  denote  by  {uj}  the  unique  set 
of  complex  coefficients  such  that 


so  that 


N  9/2 

E  ')'*  •  E 

k=0  j=-9/2 


j  =  -mNf2 


'^1=  E  P]k. 

J,k,Vk+}=l 


(4.56) 


(4.57) 
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We  denote  by  {V/}  a  set  of  complex  numbers  defined  by  the  formula 

mN/2 

Vi=  Vk-  (4.58) 

k=—mN/2 

for  /  =  -m^Nl2, . . . ,  m^N/2  -  l._ 

Further,  we  will  denote  by  {hj}  another  set  of  complex  numbers  defined  by  the 
formula 

912 

=  (4.59) 

l=-q/2 

for  7  =  0, ... ,  N. 

Observation  4.12  Combining  (4-4^)  o.nd  (4-55)  -  (4-59)  with  the  triangle  inequality, 
we  see  that  ^ 

|A, (4.60) 

k=0 

for  j  =  0, . . . ,  iV,  where  {hj  =  H{‘y)j]  are  defined  by  {4-4) y 

8  =  ■  (46  +  9).  (4.61 ) 

4.4  Detailed  Descriptions  of  the  Algorithms 

This  section  contains  descriptions  of  algorithms  for  Problems  4. 1-4.3  of  this  chapter. 
In  the  tables  below  we  will  make  use  of  the  facts  that  q  ~  log(j)  and  •C  N. 

Remark  4.13  Results  of  our  extensive  numerical  experiments  indicate  that  m  =  2  is 
an  optimal  choice  for  both  efficiency  and  accuracy.  Each  of  Algorithms  4. 1-4.3  then 
requires 

O  ■  log  TV  +  •  log  )  (4.62) 

arithmetic  operations. 

Remark  4.14  The  storage  requirements  of  an  algorithm  are  also  an  important  char¬ 
acteristic.  From  the  descriptions  of  the  initialization  steps  the  asymptotic  storage 
requirements  for  each  of  Algorithms  4. 1-4.3  are  of  the  form 

\  N  q  (4.63) 

where  the  coefficient  A  is  software-  and  hardwrare-dependent. 
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Algorithm  4.1 

Step  Complexity  Description 

Init  0{Nq)  Comment  [Input  parameter  is  the  vector  {u>o, . . 

Choose  precision  £  to  be  achieved,  set  b  w  log(l/£)  and  q  =  [46x]. 
do  A:  =  0,  iV 

Determine  Hk,  the  nearest  integer  to  muk 
do  j  =  -q/2,ql2 

Compute  Pjk  according  to  (4.41) 
end  do 
end  do 

do  j  =  -N/2,N/2 
Compute 

end  do 

1  0(Nq)  Comment  [Input  parameter  is  the  vector  {oo, . . . ,  Ojv}-] 

Comment  [Compute  Fourier  coefficients  Tj.] 

do  k  =  0,N 

do  j  =  -ql2,ql2 

^  +  Rjk  •  Oik 

end  do 
end  do 

2  O(miVlogiV)  Comment  [Evaluate  this  Fourier  Series  at  equispaced  points  in 

[— 77ix,mx]  using  inverse  FFT  of  she  mN .] 

Compute  Tj  =  for  j  =  -mN/2, ...,  mN/2  -  1. 

3  0{N)  Comment  [Scale  the  values  at  those  points  which  lie  in  [— x,x].] 

do  ;•  =  -NI2,Nf2 

J.  —  ,  g6(27rj7m.V)^ 

end  do 


Total  0(iV  •log(j)  +  mA^ -log A) 
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Algorithm  4.2 

Step  Complexity  Description 

Init  0{Nq)  Comment  [Input  parameter  is  the  vector  {xq,  . . . , 

Choose  precision  £  to  be  achieved,  set  b  a  log(l/£)  and  q  =  [465r]. 
do  j  =  0,N 

Determine  i/j,  the  nearest  integer  to  XjmN  12'k 
do  k  =  —qf2^qf2 

Compute  Qjk  according  to  (4.48) 
end  do 
end  do 

do  it  =  -Nl2,Nf2 
Compute 
end  do 

1  0{N)  Comment  [Input  parameter  is  the  complex  vector 

{P-N/2^-  ■  ■■,0N/2}-] 

Comment  [Compute  new,  scaled  Fourier  coefficients.] 

do  k  =  -N/2,N/2 
Uk  =  I3k- 

end  do 

2  0{mN  log  N)  Comment  [Evaluate  this  Fourier  Series  at  equispaced  points  in 

[— ;r,7r]  using  inverse  FFT  of  size  mN.] 

Compute  Uj  =  Y^kl^N/2  for  j  =  —mN/2, . . . ,  mN/2  -  1. 

3  0{Nq)  Comment  [Compute  approximate  values  at  desired  points  in  terms 

of  the  values  at  equispaced  points.] 

do  j  =  0,  iV 

do  k  =  -9/2,  g/2 

9]  9 3  +  Qjk  'Uvj+k 

end  do 
end  do 


Total  0(miV -log iV  +  iV  •log(i)) 


4.4.  DETAILED  DESCRIPTIONS  OF  THE  ALGORITHMS 


77 


Description 

Comment  [Input  parameters  are  the  vectors  {wq,  . .  and 

{xOi  •  •  •  7  ^n}-] 

Choose  precision  e  to  be  achieved,  set  6  w  log(l/£)  and  q  =  f46ir]. 
do  A:  =  0,  iV 

Determine  /it,  the  nearest  integer  to  muk 
do  j  =  -q/2,q/2 

Compute  Pjk  according  to  (4.41) 

end  do 
end  do 

do  A:  =  -TnN/2,TnNI2 
Compute  e'>(2’rfc/m2N)2 

end  do 

do  j  =  Q,N 

Determine  t}j,  the  nearest  integer  to  XjN/2n 
do  A:  =  -qf2,qf2 

Compute  Rjk  according  to  (4.54) 
end  do 
end  do 
do  j  =  0,N 
Compute 
end  do 

1  0{Nq)  Comment  [Input  parameter  is  the  vector  {70?  •  •  •»7n}-] 

Comment  [Compute  Fourier  coefficients  Vj.] 
do  A;  =  0,iV 

do  j  =  -q/2,q/2 

^l^k+j  ^n-k+i  "f"  ■  Ik 

end  do 
end  do 

2  0{mN)  Comment  [Scale  the  coefficients.] 

do  Ar  =  —mN f2,mN/2 

Vk  ^  Vk- 

end  do 

3  0{Tn^N  log  N)  Comment  [Evaluate  this  Fourier  Series  at  equispaced  points  in 

[— m;r,7n7r]  using  inverse  FFT  of  size  m^N.] 

Compute  Vj  =  Y!k=imNl2  for  j  =  -rn^N/2, .  ..,m^NI2  -  1. 


Algorithm  4.3 
Step  Complexity 

Init  0{Nq) 
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4  0{Nq) 


5  OiN) 


Comment  [Compute  approximate  values  at  desired  points  in 
terms  of  the  values  at  equispaced  points.] 
do  j  =  Q,N 

do  k  =  -q/2,q/2 
hj  hji  +  Rjk  ■ 
end  do 
end  do 

Comment  (Scale  the  values.] 
do  j  =  Q,N 

hj  ^  hj  ■ 

end  do 


Total  0{m'^N  ■  log  TV  +  iV  •  log(i)) 


4.5  Numerical  Estimates  of  Error  Bounds 

In  this  section  we  present  numerical  estimates  of  error  bounds  for  Theorem  4.10  of 
Section  4.1.2.  For  our  experiments  we  chose  c  =  0  and  d  =  x,  and  chose  two  different 
sets  of  values  of  m,  h  and  q.  The  expression 


2  9/2 

r(i)  =  1  -  ^  pk-  (4.64) 

k=-ql2 

where  pk  is  defined  by  (4.18),  was  evaluated  at  n  =  1000  equally  spaced  nodes  {x^} 
in  the  interval  [—^5^],  and  the  following  three  quantities  were  computed: 

•  the  maximum  absolute  error  Eoo,  defined  by  the  formula 

Foo  =  naax  |r(x/t)|,  (4.65) 

l<k<n 

•  the  relative  L2  error  E2,  defined  by  the  formula 

ELi  \r{xkW 

n 

•  the  error  bound  Eb  of  Theorem  4.10,  defined  by  the  formula 

Fb  =  e-^''('-‘/’"')-(46  +  9). 


(4.66) 

(4.67) 
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m 

b 

9 

F 

■C/QO 

E2 

Eg 

2 

0.5993 

10 

0.825  E-05 

0.176  E-05 

0.135  E-00 

2 

1.5629 

28 

0.400  E-13 

0.580  E-14 

0.163  E-03 

Table  4.1;  Error  Bounds  for  Theorem  4.10. 

The  results  of  this  experiment  are  presented  in  Table  4.1. 

We  observe  from  Table  4.1  that  the  theoretical  error  bound  Eg  of  Theorem  4.10 
is  very  weak  compared  with  the  experimentally  obtained  bound.  Indeed,  the  require¬ 
ment  that  Eg  be  appropriately  small  would  impose  much  larger  values  of  b  and  q 
than  are  actually  needed  for  Algorithms  4. 1-4.3. 
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Chapter  5 


Implementation  and  Numerical 
Results 


We  have  written  FORTRAN  implementations  of  the  algorithms  of  this  thesis  us¬ 
ing  double  precision  arithmetic,  and  have  applied  these  programs  to  a  variety  of 
situations.  In  this  chapter  we  will  restrict  our  attention  to  the  implementations  of 
Algorithms  3. 1-3.4  and  Algorithms  4. 1-4.3,  and  will  demonstrate  the  performance  of 
these  algorithms  with  numerical  examples. 

Several  technical  details  of  our  implementations  appear  to  be  worth  mentioning 
here: 

1.  Each  implementation  consists  of  two  main  subroutines:  the  first  is  an  initiaJ- 
ization  stage  in  which  the  matrix  operators  of  the  algorithm  are  precomputed 
and  stored,  and  the  second  is  an  evaluation  stage  in  which  these  operators  are 
applied.  Successive  application  of  the  linear  transformations  to  multiple  vectors 
requires  the  initialization  to  be  performed  only  once. 

2.  The  parameters  for  each  algorithm  were  chosen  to  retain  mciximum  precision. 
For  Algorithms  4. 1-4.3  we  chose  m  =  2,  6  =  1.5629  and  q  =  28,  and  for  the 
FMM  algorithms  of  Chapter  2  used  by  Algorithms  3. 1-3. 4  we  chose  p  =  22, 
p  =  10,  s  =  16  and  nlevs  =  logjCn/s). 

3.  Algorithms  3. 1-3.4  and  4. 1-4.3  all  require  the  evaluation  of  sums  of  the  form 

N/2-l 

f,=  Y.  a*  (5.1) 

k=-N/2 

for  j  =  —N/2, ,  N/2  —  1,  whereas  most  FFT  software  computes  sums  of  the 
form 

/,  =  a,  •  (5.2) 

*=0 
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forj  =  0, . . . ,  N  —  1.  We  used  a  standard  FFT  to  evaluate  sums  of  the  form  (5.1) 
by  defining  a*  =  Ofc  for  A;  =  0, . . . ,  N/2  -  1,  ojt  =  ak-N  for  k  =  N/2, ...  ,N  -1, 
-  fj  for  j  =  0, . . . ,  N/2  -  1  and  /_,  =  fk-N  for  j  =  N/2, . . . ,  TV  -  1.  This 
substitution  converts  the  form  (5.1)  to  the  form  (5.2). 

4.  The  algorithms  of  this  thesis  which  require  an  FFT  of  size  proportional  to  N 
will  perform  efficiently  whenever  the  FFT  does.  This  restriction  on  problem 
size  can  be  removed  by  extending  the  input  vector  to  length  2 ^“*2^  (i.e.  the 
smallest  power  of  2  which  is  greater  than  N)  and  padding  it  with  zeroes.  This 
ensures  that  the  algorithms  will  perform  efficiently  for  any  choice  of  N.  In  our 
implementations,  these  changes  were  made. 

Our  implementations  of  the  algorithms  of  this  thesis  have  been  tested  on  the  the 
Sun  SPARCstation  1  for  a  variety  of  input  data.  Seven  experiments  are  described  in 
this  chapter,  and  their  results  are  summarized  in  Tables  5. 1-5.7.  These  tables  contain 
error  estimates  and  CPU  timings  for  the  algorithms,  with  all  computations  performed 
in  double  precision  arithmetic. 

The  table  entries  are  described  below. 

•  The  first  column  in  each  table  contains  the  problem  size  N. 

•  The  second  column  in  each  table  contains  the  relative  oo-norm  error  defined  by 
the  formula 

~  /iW<N 

where  the  vector  /  is  the  algorithm  output  and  the  vector  /  is  the  result  of  a 
direct  calculation. 


•  The  third  column  in  each  table  contains  the  relative  2-norm  error  defined  by 
the  formula 


where  the  vector  /  is  the  algorithm  output  and  the  vector  /  is  the  result  of  a 
direct  calculation. 


•  The  fourth  and  fifth  columns  in  each  table  contain  CPU  timings  for  the  initial¬ 
ization  and  evaluation  stages  of  the  algorithm. 

•  The  sixth  column  in  each  table  contains  CPU  timings  for  the  corresponding 
direct  calculation. 


•  The  last  column  in  each  table  contains  CPU  timings  for  an  FFT  of  the  same 
size. 

Remark  5.1  Our  implementations  of  the  direct  methods  for  Examples  1,  2,  4  and 
5  were  optimized  by  using  the  fact  that  to  reduce  the  number  of  com¬ 

plex  exponential  computations.  In  Example  3  however,  N'^  exponentials  are  required 
for  the  direct  method,  and  for  larger  N,  the  available  memory  on  the  machine  is 
insufficient  for  the  precomputation  and  storage  of  these  numbers.  The  direct  imple¬ 
mentation  we  used  for  this  problem  computes  each  exponential  when  it  is  needed. 

Remark  5.2  Standard  LINPACK  Gaussian  Elimination  subroutines  were  used  as 
the  direct  methods  for  comparing  timings  in  Examples  6  and  7.  Estimated  timings 
are  presented  for  larger  N,  where  this  computation  became  impractical. 

Following  are  the  descriptions  of  the  experiments,  and  the  tables  of  numerical 
results. 

Example  1. 

Here  we  consider  the  transformation  F  :  of  Problem  4.1,  defined  by 

the  formula 

=  (5.5) 

*=o 

for  j  =  —Nf2, . . . ,  N/2.  In  this  example,  {wq,  •  •  • ,  were  randomly  distributed  on 
the  interval  [~N/2,  N/2]  and  {qo,  . . .  ,0/^)  were  complex  numbers  randomly  chosen 
from  the  unit  square 

0  <  Re(2)  <  1,  0  <  Im(2)  <  1.  (5.6) 

The  results  of  applying  Algorithm  4.1  to  this  problem  are  presented  in  Table  5.1. 

Example  2. 

Here  we  consider  the  transformation  G  :  of  Problem  4.2,  defined  by 

the  formula 

N/2 

GW,=  E  (5-7) 

k--N/2 

for  j  =  0,. ..  ,N.  In  this  example,  {xq,  ...,Xiv}  were  randomly  distributed  on  the 
interval  [— tt,  tt]  and  {^_n/2?  •  •  •  •>  0N/2}  were  complex  numbers  randomly  chosen  from 
the  unit  square 

0  <  Re(2)  <  1,  0  <  Im(^)  £  1.  (5.8) 

The  results  of  applying  Algorithm  4.2  to  this  problem  are  presented  in  Table  5.2. 
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Example  3. 

Here  we  consider  the  transformation  H  :  of  Problem  4.3,  defined  by 

the  formula 

«(7),  =  (5-9) 

k=0 

for  j  =  0, . . . ,  N.  In  this  example,  {cjq,  .  -  -  were  randomly  distributed  on  the 
interval  [— iV/2,  N/2],  {loj  •  ■  • » were  randomly  distributed  on  the  interval  [— ir,  ir] 
and  {70, ...  ,7a/}  were  complex  numbers  randomly  chosen  from  the  unit  square 

0  <  Re(2)  <  1,  0  <  Im(z)  <  1.  (5.10) 

The  results  of  applying  Algorithm  4.3  to  this  problem  are  presented  in  Table  5.3. 

Example  4. 

Here  we  consider  the  transformation  F  :  of  Problem  3.1,  defined  by  the 

formula 

N/2-1 

F{a),=  (5.11) 

k=-N/2 

for  j  =  1, . . . ,  N.  In  this  example,  {ii, . . .  ,xn}  were  randomly  distributed  on  the  in¬ 
terval  [— TT,  7r]  and  {a_A//2,  •  •  • ,  aAr/2-1}  were  complex  numbers  randomly  chosen  from 
the  unit  square 

0  <  Re(2)  <  1,  0  <  Im(2)  <  1.  (5.12) 

The  results  of  applying  Algorithm  3.1  to  this  problem  are  presented  in  Table  5.4. 

Example  5. 

Here  we  consider  the  transformation  F^  :  of  Problem  3.2,  defined  by  the 

formula 

F’'(a),  =  ai  ■  e-’-'  (5.13) 

fc=l 

for  j  =  —Nf2, . . . ,  N/2  —  1.  In  this  example,  {xi, . . . ,  xa/}  were  randomly  distributed 
on  the  interval  [— 7r,7r]  and  {ai,...,aA/}  were  complex  numbers  randomly  chosen 
from  the  unit  square 

0  <  Re(z)  <  1,  0  <  Im(2)  <  1.  (5.14) 

The  results  of  applying  Algorithm  3.2  to  this  problem  are  presented  in  Table  5.5. 

Example  6. 

Here  we  consider  the  transformation  F~^  :  ^  of  Problem  3.3  where  F  is 
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defined  by  the  formula 


N/i-i 

Yi  ■ 


k=-N/2 


(5.15) 


for  j ’  =  1, . . . ,  iV.  In  this  example,  {xi, . . . ,  were  defined  by  the  formulae 


Xj 


—7r  +  2a-  • 


j  +  0.5  +  6j 
N 


(5.16) 


for  ji  =  1, . . . ,  N,  where  Sj  were  randomly  distributed  on  the  interval  [—0.1, 0.1].  In 
addition,  {a_7v/25  ■  ■  ■  ■,  ctN/i-i}  were  complex  numbers  randomly  chosen  from  the  unit 
square 

0  <  Re(2)  <  1,  0  <  Im(2)  <  1,  (5.17) 

and  the  numbers  {/i, . . .  ,//v}  were  computed  directly  in  double  precision  arithmetic 
according  to  the  formula  fj  =  F{a)j.  The  vector  /  v/as  then  used  as  input  for  Algo¬ 
rithm  3.3.  Results  of  this  experiment  are  presented  in  Table  5.6. 


Example  7. 

Here  we  consider  the  transformation  (F^)~^  :  of  Problem  3.4  where  is 

defined  by  the  formula 

=  (5.18) 

fe=i 

for  j  =  —NI2, . . . ,  iV/2— 1.  In  this  example,  (xj, . . . ,  xat}  were  defined  by  the  formulae 


Xj  =  — TT  -t-  27r 


j  +  0.5  -f  6j 
N 


(5.19) 


for  J  =  1, . . . ,  TV,  where  Sj  were  randomly  distributed  on  the  interval  [—0.1, 0.1].  In 
addition,  {qj,  . . . ,  }  were  complex  numbers  randomly  chosen  from  the  unit  square 


0  <  Re(2)  <  1,  0  <  Im(z)  <  1,  (5.20) 

and  the  numbers  {f-N/2,---,fN/2-i}  were  computed  directly  in  double  precision 
arithmetic  according  to  the  formula  fj  =  F^{a)j.  The  vector  /  was  then  used  as 
input  for  Algorithm  3.4.  Results  of  this  experiment  are  presented  in  Table  5.7. 


The  following  observations  can  be  made  from  Tables  5. 1-5.7,  and  are  in  agree¬ 
ment  with  results  of  our  more  extensive  experiments  for  this  particular  architecture, 
implementation  and  range  of  TV. 

1.  All  of  the  algorithms  permit  high  accuracy  to  be  attained,  and  the  observed 
errors  are  in  accordance  with  the  theoretically  obtained  error  bounds. 
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2.  As  expected,  the  CPU  timings  for  all  the  algorithms  grow  slightly  faster  than 
linearly  with  the  problem  size,  N. 

3.  The  timings  for  Algorithms  4.1  and  4.2  are  similaur,  which  is  to  be  expected 
since  Problem  4.2  is  the  adjoint  of  Problem  4.1.  Algorithm  4.3  is  about  twice  as 
costly,  which  is  in  agreement  with  the  fact  that  it  is  a  synthesis  of  Algorithms  4.1 
and  4.2. 

4.  The  timings  for  Algorithms  3. 1-3.4  are  similar,  which  is  to  be  expected  since 
these  four  algorithms  are  so  closely  related. 

5.  The  initialization  times  for  Algorithms  3.1  and  3.2  are  considerably  less  than 
those  for  Algorithms  3.3  and  3.4.  This  is  because  the  former  pair  does  not 
require  the  additional  computation  of  the  numbers  {cjt}  and  {djt}. 

6.  Algorithms  4.1  and  4.2  are  about  5  times  as  costly  as  an  FFT  of  the  same  size, 
and  for  Algorithms  3. 1-3.4,  this  ratio  is  about  15. 

7.  Break-even  points  for  Algorithms  4.1  ajid  4.2  with  the  direct  method  are  at 
N  =  128  if  the  initialization  time  is  included,  and  at  A  =  16  if  it  is  ignored. 
Algorithm  4.3  becomes  faster  than  the  direct  calculation  at  A  =  32,  even  when 
the  initialization  time  is  included. 

8.  Algorithms  3.1  and  3.2  can  compete  with  the  direct  method  at  about  A  = 
32  ignoring  initialization  time,  and  at  A  =  1024  including  the  initialization. 
Algorithms  3.3  and  3.4  are  cdways  dramatically  feister  than  the  direct  calculation 
(15000  times  faster  at  A  =  2048)  if  we  ignore  initialization  time,  and  break  even 
with  it  at  A  =  64  if  we  include  the  initialization. 

9.  The  initialization  stage  is  much  more  costly  than  the  evaluation  stage  for  all  of 
the  algorithms.  Implementing  the  algorithms  in  two  stages  thus  gives  consider¬ 
able  time  savings  whenever  the  same  linear  transformation  is  to  be  applied  to 
multiple  vectors. 
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N 

Errors 

Timings  (sec.) 

Eoo 

E2 

Alg.  Init. 

Alg.  Eval. 

Direct 

EFT 

64 

0.602  E-14 

0.638  E-14 

0.036 

0.008 

0.02 

0.001 

128 

0.356  E-14 

0.715  E-14 

0.075 

0.016 

0.08 

0.002 

256 

0.437  E-14 

0.946  E-14 

0.148 

0.034 

0.31 

0.005 

512 

0.519  E-14 

0.160  E-13 

0.297 

0.075 

1.20 

0.012 

1024 

0.518  E-14 

0.314  E-13 

0.600 

0.155 

4.76 

0.026 

2048 

0.755  E-i4 

0.631  E-13 

1.204 

0.322 

18.93 

0.059 

Table  5.1:  Example  1,  Numerical  Results  for  Algorithm  4.1. 


N 

Errors 

Timings  (sec.) 

Eqo 

E2 

Alg.  Init. 

Alg.  Eval. 

Direct 

EFT 

64 

0.249  E-14 

0.814  E-14 

0.038 

0.005 

0.02 

0.001 

128 

0.501  E-14 

0.746  E-14 

0.075 

0.012 

0.09 

0.002 

256 

0.418  E-14 

0.623  E-14 

0.148 

0.028 

0.33 

0.005 

512 

0.356  E-14 

0.831  E-14 

0.297 

0.060 

1.24 

0.012 

1024 

0.793  E-14 

0.192  E-13 

0.596 

0.126 

4.93 

0.026 

2048 

0.138  E-13 

0.405  E-13 

1.188 

0.264 

19.62 

0.059 

Table  5.2:  Example  2,  Numerical  Results  for  Algorithm  4.2. 


N 

Errors 

Timings  (sec.) 

Eoo 

E2 

Alg.  Init. 

Alg.  Eval. 

Direct 

EFT 

64 

0.166  E-13 

0.226  E-13 

0.074 

0.015 

0.20 

0.001 

128 

0.252  E-13 

0.216  E-13 

0.153 

0.034 

0.79 

0.002 

256 

0.318  E-13 

0.315  E-13 

0.302 

0.069 

3.18 

0.005 

512 

0.131  E-13 

0.289  E-13 

0.601 

0.146 

12.76 

0.012 

1024 

0.203  E-13 

0.425  E-13 

1.210 

0.297 

51.12 

0.026 

2048 

0.324  E-13 

0.801  E-13 

2.403 

0.643 

205.17 

0.059 

Table  5.3:  Example  3,  Numerical  Results  for  Algorithm  4.3. 
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N 

Errors 

Timings  (sec.) 

Eoo 

E2 

Alg.  Init. 

Alg.  Eval. 

Direct 

EFT 

128 

0.379  E-14 

0.704  E-14 

1.04 

0.030 

0.09 

0.002 

256 

0.398  E-14 

0.116  E-13 

2.03 

0.081 

0.33 

0.005 

512 

0.499  E-14 

0.195  E-13 

3.26 

0.171 

1.24 

0.012 

1024 

0.318  E-13 

0.625  E-13 

4.97 

0.408 

4.93 

0.026 

2048 

0.763  E-13 

0.204  E-12 

8.07 

0.822 

19.62 

0.059 

Table  5.4;  Example  4,  Numerical  Results  for  Algorithm  3.1. 


N 

Errors 

Timings  (sec.) 

Eoo 

E2 

Alg.  Init. 

Alg.  Eval. 

Direct 

FFT 

128 

0.206  E-14 

0.800  E-14 

1.03 

0.033 

0.08 

0.002 

256 

0.323  E-14 

0.136  E-13 

2.05 

0.081 

0.31 

0.005 

512 

0.153  E-13 

0.343  E-13 

3.21 

0.174 

1.20 

0.012 

1024 

0.180  E-13 

0.654  E-13 

5.11 

0.409 

4.76 

0.026 

2048 

0.470  E-13 

0.221  E-12 

8.16 

0.823 

18.93 

0.059 

Table  5.5:  Example  5,  Numerical  Results  for  Algorithm  3.2. 
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N 

Errors 

Timings 

(sec.) 

X 

E 

2 

Alg.  Init. 

Alg.  Eval. 

Direct 

EFT 

128 

0.117 

E-13 

0.800 

E-14 

1.28 

0.034 

2.96 

0.002 

256 

0.196 

E-13 

0.137 

E-13 

2.51 

0.082 

23.6 

0.005 

512 

0.344 

E-13 

0.230 

E-13 

4.33 

0.175 

189 

0.012 

1024 

0.107 

E-12 

0.757 

E-13 

7.45 

0.409 

1512  (est.) 

0.026 

2048 

0.357 

E-12 

0.247 

E-12 

12.97 

0.819 

12096  (est.) 

0.059 

Table  5.6:  Example  6,  Numerical  Results  for  Algorithm  3.3. 


N 

Errors 

Timings 

(sec.) 

E, 

X 

E 

2 

Alg.  Init. 

Alg.  Eval. 

Direct 

FFT 

128 

0.134 

E-13 

0.806 

E-14 

1.26 

0.033 

2.96 

0.002 

256 

0.511 

E-13 

0.179 

E-13 

2.47 

0.080 

23.6 

0.005 

512 

0.870 

E-13 

0.373 

E-13 

4.24 

0.173 

189 

0.012 

1024 

0.178 

E-12 

0.811 

E-13 

7.29 

0.407 

1512  (est.) 

0.026 

2048 

0.942 

E-12 

0.369 

E-12 

12.80 

0.820 

12096  (est.) 

0.059 

Table  5.7:  Example  7,  Numerical  Results  for  Algorithm  3.4. 
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Chapter  6 

Conclusions  and  Generalizations 


The  principal  results  of  this  thesis  are  two  groups  of  efficient  algorithms  for  computing 
FFTs  for  nonequispaced  data  to  any  specified  precision.  Similarities  and  differences 
between  the  two  approaches  are  summarized  below. 

1.  Both  sets  of  algorithms  use  a  standard  FFT. 

2.  Both  sets  of  algorithms  use  interpolation  formulae  to  transform  function  values 
from  equispaced  to  nonequispaced  points  and  vice-versa.  Algorithms  3. 1-3.4 
use  an  interpolation  scheme  based  on  the  FMM,  while  Algorithms  4. 1-4.3  use 
an  interpolation  scheme  based  on  the  Fourier  analysis  of  Gaussian  bells. 

3.  For  the  application  of  the  linear  tranformations  being  considered,  the  algorithms 
of  Chapter  4  are  more  efficient  than  the  algorithms  of  Chapter  3. 

4.  For  the  inversion  of  these  linear  transformations,  the  schemes  of  Chapter  4  can 
be  applied  iteratively,  however  the  direct  approach  of  Chapter  3  is  generally 
more  efficient. 

5.  Algorithms  3. 1-3.4  comprise  a  set  of  closely  related  forward  and  inverse  algo¬ 
rithms  which  can  be  generalized  to  complex  data  points. 

In  conclusion,  two  groups  of  algorithms  have  been  presented  for  the  rapid  applica¬ 
tion  and  inversion  of  matrices  of  the  Fourier  kernel.  These  problems  can  be  viewed  as 
generalizations  of  the  discrete  Fourier  transform,  and  the  algorithms,  while  making 
use  of  certain  simple  results  from  analysis,  are  very  versatile,  and  have  a  broad  range 
of  applications  in  many  branches  of  mathematics,  science  and  engineering.  Severed 
related  algorithms  have  also  been  presented  which  utilize  the  analytical  tools  of  this 
thesis.  These  include: 

1.  an  efficient  version  of  the  Fast  Multipole  Method  in  one  dimension. 
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2.  a  fast  algorithm  for  polynomial  interpolation  on  the  real  line, 

3.  fast  algorithms  for  spectral  integration  and  differentiation  of  functions  tabulated 
at  nodes  other  than  Chebyshev,  and, 

4.  FFT  for  complex  data  points. 

The  results  of  this  thesis  are  currently  being  applied  to  problems  in  a  diversity  of  areas. 
Examples  include  problems  in  the  numerical  solution  of  parabolic  partial  differential 
equations,  the  analysis  of  seismic  data,  the  modelling  of  semiconductors,  weather 
prediction  and  the  numerical  simulation  of  fluid  behavior. 

Several  straightforward  generalizations  of  the  results  of  this  thesis  are  discussed 
below. 

1.  Problems  3.1,  3.2,  4.1,  4.2  and  4.3  all  involve  the  evaluation  of  an  A'^-term  series 
at  N  points.  Minor  modifications  to  Algorithms  3.1,  3.2,  4.1,  4.2  and  4.3  will 
allow  the  efficient  evaluation  of  these  .,V-term  series  at  M  points,  where  M  N. 
These  changes  have  been  implemented. 

2.  The  algorithms  of  this  thesis  also  assume  that  wt  €  [— iV/2,Ar/2]  and  Xj  € 
[— tTjTt].  Other  distributions  of  u  and  i  can  be  handled  by  appropriately  parti¬ 
tioning  these  vectors,  treating  each  partition  separately  and  finally  combining 
the  results.  The  following  observation  describes  translation  operators  which  can 
be  used  for  each  partition,  in  combination  with  one  of  Algorithms  4. 1-4.3,  or 
with  one  of  Algorithms  3. 1-3.2. 


Observation  6.1  Let  a,b  >  Q,c,d  >  0  be  a  set  of  real  numbers  and  suppose 
that  u3k  €  [a  —  6,  a  -|-  6]  for  fc  =  0, . . . ,  AT  and  Xj  6  [c  —  d,c  +  <I\  for  j  =  0, . . . ,  M. 
Then  we  can  write 


where 


N 

I 

k=0 


N 


k=0 


k=0 


OCl 


^k 


ak  • 

(wfc  -  a)d/7r, 

{xj  -  c)7r/d  €  [-7r,7r]. 


(6.1) 


(6.2) 


Remark  6.2  An  algorithm  of  this  type  will  perform  efficiently  when  the  points 
within  a  partition  are  close  together  and  there  are  very  few  partitions,  and  not 
so  efficiently  if  the  points  are  widely  separated  and  there  are  many  partitions. 

3.  The  algorithms  of  this  thesis  are  based  on  a  special  case  of  a  more  general  idea, 
namely  the  adaptive  use  of  interpolation  techniques  to  speed  up  l2U'ge  scale 
computations.  Other  examples  of  this  approach  include  the  use  of  wavelets  for 
the  construction  of  fast  numerical  algorithms  (see,  for  example,  [1],  [4]),  and  the 
use  of  multipole  or  Chebyshev  expansions  for  the  compression  of  certain  classes 
of  linear  operators  (see,  for  example,  [2],  [7],  [22]). 

4.  One  of  the  more  far-reaching  extensions  of  the  results  of  this  thesis  is  a  set 
of  algorithms  for  discrete  Fourier  transforms  in  several  dimensions.  In  two 
dimensions,  for  example,  we  may  wish  to  evaluate  the  sums 

N 

=  (6.3) 

k=\ 

for  each  integer  pair  (m,n),  where  the  points  (xfc.yjt)  are  generally  distributed 
in  the  plane.  A  straightforward  application  of  the  techniques  of  this  thesis  yields 
an  order 

A-logiV-|-Ar-log2  0  (6.4) 

algorithm  for  this  problem.  Special-purpose  algorithms  can  also  be  designed  for 
increased  efficiency  in  Ccises  when  the  points  {zkiVk)  he  on  a  curve.  Detailed 
investigations  into  higher  dimensional  problems  of  this  type  are  currently  in 
progress  and  will  be  reported  at  a  later  date. 


CHAPTER  6.  CONCLUSIONS  AND  GENERALIZATIONS 
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